Using NumPy with ArcGIS: FeatureClass to NumPy

Demonstrates manipulation of feature class attribute data using NumPy. This is by no means an in-depth introduction to NumPy, but it should familiarize you with what NumPy is about and how it can be used with ArcGIS feature classes. The links below provide more in-depth reading on NumPy and on how it is used with feature classes.

https://jakevdp.github.io/PythonDataScienceHandbook/index.html#2.-Introduction-to-NumPy
http://desktop.arcgis.com/en/arcmap/latest/analyze/arcpy-data-access/featureclasstonumpyarray.htm


In [ ]:
#Import arcpy and numpy
import arcpy
import numpy as np

In [ ]:
#Point to the HUC12.shp feature class in the Data folder
huc12_fc = '../Data/HUC12.shp'
print arcpy.Exists(huc12_fc)

Here, we convert the feature class to a NumPy array using ArcPy's FeatureClassToNumPyArray function.


In [ ]:
#List the fields we want to convert
fieldList = ["SHAPE@XY","HUC_8","HUC_12","ACRES"]
arrHUCS = arcpy.da.FeatureClassToNumPyArray(huc12_fc,fieldList)
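To get a feel for the kind of object this returns without needing ArcGIS, the sketch below builds a small structured array by hand. The field names mirror `fieldList`, but the dtypes and every value are made up for illustration; the real dtype depends on how the fields are defined in the shapefile.

```python
import numpy as np

# Hypothetical dtype: an (x, y) pair, two text codes, and a float area.
# This is an assumption, not the dtype ArcPy reports for HUC12.shp.
demoDtype = np.dtype([
    ("SHAPE@XY", "<f8", (2,)),
    ("HUC_8", "U8"),
    ("HUC_12", "U12"),
    ("ACRES", "<f8"),
])

# Made-up records, one tuple per feature
arrDemo = np.array(
    [
        ((100.0, 200.0), "03040103", "030401030101", 1000.0),
        ((110.0, 210.0), "03040103", "030401030102", 3000.0),
        ((120.0, 220.0), "03040104", "030401040101", 2000.0),
    ],
    dtype=demoDtype,
)

print(arrDemo.size)           # number of records
print(arrDemo.dtype.names)    # the "column" names
print(arrDemo["ACRES"].mean())
```

Each record is a tuple of named fields, so the array behaves like a table whose columns can have different types.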

With the data in a NumPy array, we can perform a variety of operations on the feature class attributes. But first, let's inspect the array's properties.


In [ ]:
#What is the type of the arrHUCS variable, and how many records does it contain?
print type(arrHUCS)
print arrHUCS.size

In [ ]:
#What data types are stored in this array?
print arrHUCS.dtype

In [ ]:
#Or, just the names of the "columns"
print arrHUCS.dtype.names

In [ ]:
#Show the first row of data
print arrHUCS[0]

In [ ]:
#Show the first 5 rows of data
print arrHUCS[0:5]

In [ ]:
#Show the HUC_8 value of the 5th row
print arrHUCS[4]['HUC_8']

In [ ]:
#List all the HUC12s
print arrHUCS['HUC_12']
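The indexing patterns above (an integer or slice selects records, a field name selects a column) can be tried on a stand-alone array. This is a minimal sketch on made-up values; `arrDemo` and its two fields are hypothetical stand-ins for the HUC12 attribute table.

```python
import numpy as np

# Made-up records standing in for the HUC12 attribute table
arrDemo = np.array(
    [("03040103", 1000.0), ("03040103", 3000.0), ("03040104", 2000.0)],
    dtype=[("HUC_8", "U8"), ("ACRES", "<f8")],
)

row = arrDemo[1]            # one record, with all of its fields
column = arrDemo["ACRES"]   # one field across all records (a plain ndarray)

print(row["HUC_8"])
print(column[1])

# Row-then-field and field-then-row reach the same value
print(arrDemo[1]["ACRES"] == arrDemo["ACRES"][1])
```

Selecting the field first is usually the more convenient order, because the result is an ordinary array that supports methods such as `mean()`.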

In [ ]:
#Compute the mean area of all HUCs
print arrHUCS['ACRES'].mean()

We can also subset the records in our array, which we will do as a two-step process. First, we create a boolean mask array: an array of true and false values in which a record is true if it meets a condition. Then we apply this mask to our original array to isolate the records where the mask is true.


In [ ]:
#First we make a boolean mask and show the first 10 records
arrMask = (arrHUCS["HUC_8"] == '03040103')
arrMask[:10]

In [ ]:
#Now we apply the mask to isolate the records where this is true
arrSelectedHUC8 = arrHUCS[arrMask]
#The original array had 201 records; how many records does this one have?
print arrSelectedHUC8.size
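The same two-step pattern can be checked on a few made-up records, where the result is easy to verify by eye. The sketch below also shows one possible extension, combining two conditions with `&`; the array, its values, and the 2000-acre threshold are assumptions chosen only for illustration.

```python
import numpy as np

# Made-up records standing in for the HUC12 attribute table
arrDemo = np.array(
    [("03040103", 1000.0), ("03040103", 3000.0), ("03040104", 2000.0)],
    dtype=[("HUC_8", "U8"), ("ACRES", "<f8")],
)

# Step 1: one True/False value per record
arrDemoMask = arrDemo["HUC_8"] == "03040103"
print(arrDemoMask)

# Step 2: keep only the records where the mask is True
arrDemoSelected = arrDemo[arrDemoMask]
print(arrDemoSelected.size)

# Conditions can be combined with & (and) or | (or);
# each condition needs its own parentheses
arrDemoLarge = arrDemo[(arrDemo["HUC_8"] == "03040103") & (arrDemo["ACRES"] > 2000)]
print(arrDemoLarge["ACRES"])
```

Keeping the mask in its own variable makes it easy to inspect, or to count matches with `arrDemoMask.sum()`, before applying it.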

In [ ]:
#Show the first 10 rows
arrSelectedHUC8[:10]

In [ ]:
#Calculate the mean area of these HUCs
arrSelectedHUC8['ACRES'].mean()

In [ ]:
#Plot a histogram of HUC_12 areas
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn; seaborn.set()  # set plot style

In [ ]:
plt.hist(arrHUCS['ACRES']);
plt.title('Area Distribution of HUC_12s')
plt.xlabel('Area (acres)')
plt.ylabel('number');
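If the bin counts themselves are of interest rather than the picture, NumPy's `np.histogram` returns them directly. A minimal sketch, using a handful of made-up areas and an arbitrary choice of 5 bins:

```python
import numpy as np

# Made-up areas in acres; in the notebook this would be arrHUCS['ACRES']
acres = np.array([1000.0, 3000.0, 2000.0, 2500.0, 500.0])

# counts[i] is the number of values between edges[i] and edges[i + 1]
counts, edges = np.histogram(acres, bins=5)

print(counts)
print(edges)
```

There is always one more edge than there are bins, and the counts sum to the number of values.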