Friday, July 27, 2012

Custom Python Script - Batch Raster Extractor

Introduction 


This code was created during a Programming and Development for GIS course at UCLA during Winter 2012. The purpose of the code is intended for research conducted by Dr. Thomas Gillespie of the UCLA Geography Department, however the code can be used for a variety of applications.

Dr. Gillespie's research involves identifying global biodiversity hotspots that have seen significant landcover change. The data set being used is called the GLOBCOVER project and it is created by the European Space Agency (ESA). The data is a satellite imagery raster that has been classified into landcover types. A new data set is created every few years and Gillespie's research goal is to compare eco-regions, countries and protected areas over time to look for forest degradation.

To help solve this research question, this code produces statistics of landcover area (km2) and percentage land cover (%) for polygons (ie eco-regions, countries and protected areas). This is accomplished by using the polygons to extract pixel data from the GLOBCOVER raster. In order to do a temporal comparison, a separate code would be required to draw comparison statistics.




Technical Code Description:
The Batch Raster Extractor is meant to provide a standard and repeatable way to extract raster data using the polygons of a vector file. The script operates in four main steps. First, the user inputs the vector file and raster file of interest. Second, a split by attribute tool is used to export each polygon into a separate shapefile. Third, each polygon is masked on top of the raster and used to extract a subset of the raster. Fourth, statistics are calculated for each raster subset, Area and Percentage fields are appended to the attribute tables of each subset.

Code Requirements:
To use this code, the user must have Python 2.6 for ArcGIS 10 (or above) installed. A custom python script called "Split Layer by Attributes" is used in this code, it must be downloaded for this code to work from this link. User must have licensing for ArcGIS Spatial Analyst Extension.

Code Usage Guide:
  1. Copy and past code below into Python IDLE window and save as a .py. 
  2. Change / modify the ten lines in the code that have a comment on the right hand side in green (indicated by ' ### '  )
  3. Save  [ CTRL + S ]
  4. Run [ F5 ]
Trouble Shooting
  1. If code returns a 'NoneType' error, close all files (including ArcMap, IDLE and shell) and re-run code. (This error is due to the code's use of the custom tool "Split Layer By Attributes"
  2. If code crashes while running, it is possible that polygons exist in the vector file that do not have corresponding raster data to extract. To fix this, manually remove those polygons in ArcMap.

Code broken down into pictures:


Step 1: 

Specify two inputs 

1. Vector Layer
  • example = shapefile of all countries in world


















2. Raster Layer
  • example = GlobCover , European Space Agency (ESA) 




















Step 2: 

Split apart shapes within vector file and save each as a separate shapefile (.shp)




 

Step 3: 

Use geometry of each shape to "cookie cutter" raster and save each subset as a raster file



Step 4: 

Calculate pixel statistics 
  •  will append attribute table with Area (Km2) and Percentage fields (this screen shot is from an earlier iteration of the code when the percentage field was not populating correctly, this bug has since been fixed)



PYTHON CODE


'''BatchRasterExtractor.py

Author:
  Kelsey Kaszas
  Dept of Geography and Environmental Studies
  University of California, Los Angeles, USA
  kelsey.kaszas@gmail.com
  Special Thanks to Jida Wang 

Date created April 9 2012
Modified     July 27 2012

Purpose:
  Converts each shape in a feature class into a separate shapefile.
  Uses individual shapefiles to extract raster data by mask.
  Appends extracted raster files with "Value", "Area", and "Percent" fields. 
  Final output is a folder with masked raster subsets. 

Code Requirements:
  This code requires the user to have Python 2.6 for ArcGIS 10 (or above) installed on computer
  This code uses a custom python script called "Split Layer By Attributes" that can be downloaded from: http://arcscripts.esri.com/details.asp?dbid=14127
  This code requires ArcGIS Spatial Analyst Extension
  
Code Usage Guide [ IMPORTANT ] :
  1. change the ten places in code that have a comment on the right hand side (indicated by ' ### ') 
  2. save code [ CTRL + S ]
  3. run code [ F5 ]
  
Troubleshooting:
  1. If code returns a 'NoneType' error, close all files and re-run code
     (this error is due to the code's use of the custom tool "SplitLayerByAttributes")
  2. If code crashes while running, it is possible that polygons exist in the vector file that do not
     have corresponding raster data to extract. To fix this, manually remove those polygons using ArcMap
     
   
'''
#--------------------------------------------------------------------

#import system module 
import arcpy, os
from arcpy import env
from arcpy.sa import * 

#enable overwrite 
arcpy.env.overwriteOutput = True

#define filepath
filepath = r"D:\My Documents\RASTER_EXTRACTER"                                      ### change path to workspace folder
arcpy.ImportToolbox(r"D:\My Documents\SplitLayerByAttributes.tbx")                  ### change path to location of "Split Layer By Attributes" toolbox .tbx
symbologyLayer = r"D:\My Documents\GlobCover_Legend.lyr"                            ### comment out or delete this line if there is NO symbology .lyr file 


#set parameters 
#vector = arcpy.GetParameterAsText(0)
vector = r"D:\My Documents\vector.shp"                                              ### change path to vector dataset 


#raster = arcpy.GetParameterAsText(1)
raster = r"D:\My Documents\raster.tif"                                              ### change path to raster dataset

#out_vector = arcpy.GetParameterAsText(2)
out_vector = r"D:\My Documents\vector_output"                                       ### change path and change \vector_output to name of folder that will contain individual shapefiles
out_vector = str(out_vector)
os.makedirs(out_vector)

#out_raster = arcpy.GetParameterAsText(3)
out_raster = r"D:\My Documents\raster_output"                                       ### change path and change \raster_output to name of folder that will contain subset raster files                                                                                 
out_raster = str(out_raster)
os.makedirs(out_raster)

#set workspace environment 
arcpy.env.workspace = out_vector

#input pixel size of raster
z = ???                                                                             ### change ??? to pixel size of raster (units in meters) (example: z = 300)                                                           

#[SPLIT]
#_________________________________________________________________________

print("running split...")

#print(out_vector)

name_vector = '???'                                                                 ### choose naming convention: change ??? to the field name from original shapefile that output vector naming convention will be based on
arcpy.SplitLayerByAttributes(vector, name,"_",out_vector)                                                                          
print("split completed")

#[MASK]
#________________________________________________________________________

print("adding symbology layer to original raster file")
arcpy.ApplySymbologyFromLayer_management (raster, symbologyLayer)                   ### Comment out or delete this line of code if there is NO symbology .lyr file


#Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")


print("running mask of shapefile output")
for file1 in os.listdir(out_vector):    
    if file1.endswith(".shp"):
        
        #cursor out name of individual vector outputs for raster file naming convention 
        name = arcpy.SearchCursor(file1)
        for name_1 in name:
            raster_name = name_1.name_vector
            break       

        # Execute ExtractByMask
        outExtractByMask = ExtractByMask (raster, file1)  

        # Save the output
        outExtractByMask.save(out_raster + "/" + str(raster_name))



        print("performing statistics on raster. step 1...get a total pixel count from new raster subset")
        count_list = []
        value_list = []
        searched_rows = arcpy.SearchCursor(outExtractByMask)
        total_count = 0
        for row in searched_rows:
            value_list.append(row.VALUE)
            count_list.append(row.COUNT)
            total_count += row.COUNT

        print(str(total_count))

        
        print("step 2...creating tmp.dbf to serve as intermediary")
        tmpTABLE = "tmp.dbf"
        if os.path.exists(tmpTABLE):                                          
            os.remove(tmpTABLE)                                               
        
        arcpy.CreateTable_management(out_vector, tmpTABLE)

        print("step 3... add fields to tmp")
        #ADD FIELDS to tmp table
        arcpy.AddField_management(tmpTABLE, "VALUE", "LONG")
        arcpy.AddField_management(tmpTABLE, "area", "DOUBLE")
        arcpy.AddField_management(tmpTABLE, "percent", "DOUBLE") 
        print("table created")
        
        #UPDATE the tmp table with calculation in added fields
        new_row = arcpy.InsertCursor(tmpTABLE)
        searched_rows = arcpy.SearchCursor(outExtractByMask)
        index = 0
        print("step 4...update the tmp table with calculation in added fields")
        for searched_row in searched_rows:
    
            #print index
            row = new_row.newRow()
            row.VALUE = value_list[index]
            row.area = z*z*1.0*count_list[index]/1000000                                                                                  
            row.percent = count_list[index]*100.0/total_count
            index = index+1
            new_row.insertRow(row)
    
        del row
        del new_row
        del searched_row
        del searched_rows

        print("step 5 ...join the tmp table to attribute table of the newly subset raster file")         

        #join tables to the attribute table of the raster file
        arcpy.MakeRasterLayer_management (outExtractByMask,  "layerName")     
    
        # Join the feature layer to a table
        arcpy.JoinField_management("layerName", "VALUE", tmpTABLE, "VALUE")       
             
        if os.path.exists(tmpTABLE):                                          
            os.remove(tmpTABLE)


print("extract by mask completed")
code snippet creator: http://hilite.me/