Using NumPy with Rasters

In addition to converting feature classes in to NumPy arrays, we can also convert entire raster datasets into 2-dimensional arrays. This allows us, as we'll see below, to programmatically extract values from these rasters, or we can integrate these arrays with other packages to perform custom analyses with the data.

For more information on this topic see: https://4326.us/esri/scipy/devsummit-2016-scipy-arcgis-presentation-handout.pdf


In [ ]:
# Import the modules
import arcpy
import numpy as np

In [ ]:
#Set the name of the file we'll import
demFilename = '../Data/DEM.tif'

In [ ]:
#Import the DEM as a NumPy array, using only a 200 x 200 pixel subset of it
demRaster = arcpy.RasterToNumPyArray(demFilename)

In [ ]:
#What is the shape of the raster (i.e. the # of rows and columns)? 
demRaster.shape
#...note it's a 2d array

In [ ]:
#Here, we re-import the raster, this time only grabbing a 150 x 150 pixel subset - to speed processing
demRaster = arcpy.RasterToNumPyArray(demFilename,'',150,150)

In [ ]:
#What is the pixel type?
demRaster.dtype

As 2-dimensional NumPy array, we can perform rapid stats on the pixel values. We can do this for all the pixels at once, or we can perform stats on specific pixels or subsets of pixels, using their row and columns to identify the subsets.


In [ ]:
#Get the value of a specific pixel, here one at the 100th row and 50th column
demRaster[99,49]

In [ ]:
#Compute the min, max and mean value of all pixels in the 200 x 200 DEM snipped
print "Min:", demRaster.min()
print "Max:", demRaster.max()
print "Mean:", demRaster.mean()

To define subsets, we use the same slicing techniques we use for lists and strings, but as this is a 2 dimensional array, we provide two sets of slices; the first slices will be the rows and the second for the columns. We can use a : to select all rows or columns.


In [ ]:
#Get the max for the 10th column of pixels, 
# (using `:` to select all rows and `9` to select just the 10th column)
print(demRaster[:,9].max())

In [ ]:
#Get the mean for the first 10 rows of pixels, selecting all columns
# (We can put a `:` as the second slice, or leave it blank after the comma...)
x = demRaster[:10,]
x.shape
x.mean()

The SciPy package has a number of multi-dimensional image processing capabilities (see https://docs.scipy.org/doc/scipy/reference/ndimage.html). Here is a somewhat complex example that runs through 10 iterations of computing a neighborhood mean (using the nd.median_filter) with an incrementally growing neighorhood. We then subtract that neighborhood median elevation from the original elevation to compute Topographic Position Index (TPI, see http://www.jennessent.com/downloads/tpi-poster-tnc_18x22.pdf)

If you don't fully understand how it works, at least appreciate that converting a raster to a NumPy array enables us to use other packages to execute custom analyses on the data.


In [ ]:
#Import the SciPy and plotting packages
import scipy.ndimage as nd
from matplotlib import pyplot as plt

#Allows plots in our Jupyter Notebook
%matplotlib inline

#Create a 'canvas' onto which we can add plots
fig = plt.figure(figsize=(20,20))

#Loop through 10 iterations
for i in xrange(10):
    #Create a kernel, intially 3 x 3, then expanding 3 x 3 at each iteration 
    size = (i+1) * 3
    print size,
    #Compute the median value for the kernel surrounding each pixel
    med = nd.median_filter(demRaster, size)
    #Subtract the median elevation from the original to compute TPI
    tpi = demRaster - med
    #Create a subplot frame
    a = fig.add_subplot(5, 5,i+1)
    #Show the median interpolated DEM
    plt.imshow(tpi, interpolation='nearest')
    #Set titles for the plot
    a.set_title('{}x{}'.format(size, size))
    plt.axis('off')
    plt.subplots_adjust(hspace = 0.1)
    prev = med