Demonstrates manipulation of feature class attribute data using NumPy. This is by no means an in-depth introduction to NumPy, but it should familiarize you with what NumPy is and how it can be used with ArcGIS feature classes. The links below provide more in-depth reading on NumPy and on how it is used with feature classes.
https://jakevdp.github.io/PythonDataScienceHandbook/index.html#2.-Introduction-to-NumPy
http://desktop.arcgis.com/en/arcmap/latest/analyze/arcpy-data-access/featureclasstonumpyarray.htm
In [ ]:
#Import arcpy and numpy
import arcpy
import numpy as np
In [ ]:
#Point to the HUC12.shp feature class in the Data folder
huc12_fc = '../Data/HUC12.shp'
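print(arcpy.Exists(huc12_fc))
Here, we convert the feature class to a NumPy array using ArcPy's FeatureClassToNumPyArray function. The result is a structured array with one record per feature and one named field for each entry in the field list.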
In [ ]:
#List the fields we want to convert
fieldList = ["SHAPE@XY","HUC_8","HUC_12","ACRES"]
arrHUCS = arcpy.da.FeatureClassToNumPyArray(huc12_fc,fieldList)
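To get a feel for the kind of array this call returns without needing ArcPy or the shapefile, here is a minimal sketch that builds a small structured array by hand. The dtype and the three records are made up for illustration; the real field widths and values depend on HUC12.shp, and the name `demo` is hypothetical.

```python
import numpy as np

# Hypothetical dtype resembling what the conversion might produce for the
# four requested fields; actual string widths depend on the shapefile.
huc_dtype = np.dtype([
    ("SHAPE@XY", "<f8", (2,)),  # x, y coordinate pair
    ("HUC_8", "U8"),
    ("HUC_12", "U12"),
    ("ACRES", "<f8"),
])

# Made-up records, for illustration only
demo = np.array(
    [
        ((100.0, 200.0), "03040103", "030401030101", 12000.0),
        ((110.0, 210.0), "03040103", "030401030102", 18000.0),
        ((300.0, 400.0), "03040104", "030401040101", 30000.0),
    ],
    dtype=huc_dtype,
)

print(demo.dtype.names)   # the field ("column") names
print(demo.size)          # the number of records
print(demo[0]["HUC_12"])  # one field of the first record
```

Each record behaves like a row and each named field like a column, which is why the inspection cells that follow index the array both by position and by field name.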
Once the feature class is a NumPy array, we can perform a variety of operations on it. But first, let's inspect the array's properties.
In [ ]:
#What is the type of the arrHUCS variable, and how many records does it contain?
print(type(arrHUCS))
print(arrHUCS.size)
In [ ]:
#What are the data types stored in this array?
print(arrHUCS.dtype)
In [ ]:
#Or, just what are the names of the "columns"?
print(arrHUCS.dtype.names)
In [ ]:
#Show the first row of data
print(arrHUCS[0])
In [ ]:
#Show the first 5 rows of data
print(arrHUCS[0:5])
In [ ]:
#Show the HUC_8 value of the 5th row
print(arrHUCS[4]['HUC_8'])
In [ ]:
#List all the HUC_12s
print(arrHUCS['HUC_12'])
In [ ]:
#Compute the mean area of all HUC_12s
print(arrHUCS['ACRES'].mean())
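Looking up a field by name, as in the cells above, returns an ordinary NumPy array, so any of the usual aggregate methods apply to it. The sketch below illustrates this on a made-up stand-in array (`demo` is a hypothetical name, and the values are not from HUC12.shp). It also assumes the SHAPE@XY field is stored as a pair of floats per record, which would make it a two-column array that can be split into x and y.

```python
import numpy as np

# Made-up stand-in for arrHUCS, for illustration only
demo = np.array(
    [
        ((100.0, 200.0), "03040103", 12000.0),
        ((110.0, 210.0), "03040103", 18000.0),
        ((300.0, 400.0), "03040104", 30000.0),
    ],
    dtype=[("SHAPE@XY", "<f8", (2,)), ("HUC_8", "U8"), ("ACRES", "<f8")],
)

acres = demo["ACRES"]            # 1-D float array, one value per record
print(acres.mean())              # mean area
print(acres.min(), acres.max())  # smallest and largest area

xy = demo["SHAPE@XY"]            # 2-D array with shape (records, 2)
x, y = xy[:, 0], xy[:, 1]        # split into separate coordinate arrays
print(x.mean(), y.mean())        # mean of the coordinates
```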
We can also subset the records in our array, which we will do as a two-step process. First, we create a boolean mask array: an array of True and False values in which a record is True if a condition is met. Then we apply this mask to our original array to isolate the records where the mask is True.
In [ ]:
#First we make a boolean mask and show the first 10 records
arrMask = (arrHUCS["HUC_8"] == '03040103')
arrMask[:10]
In [ ]:
#Now we apply the mask to isolate the records where it is True
arrSelectedHUC8 = arrHUCS[arrMask]
#The original array had 201 records; how many records does this one have?
print(arrSelectedHUC8.size)
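The two-step masking process can be tried on a tiny array where the result is easy to check by eye. This is a sketch using made-up records (the names `demo`, `mask`, `selected`, and `big` are hypothetical), and the last step, combining two conditions, is an addition that the notebook itself does not use.

```python
import numpy as np

# Made-up records, for illustration only
demo = np.array(
    [
        ("03040103", 12000.0),
        ("03040103", 18000.0),
        ("03040104", 30000.0),
        ("03040103", 15000.0),
    ],
    dtype=[("HUC_8", "U8"), ("ACRES", "<f8")],
)

# Step 1: build the mask, one True/False value per record
mask = demo["HUC_8"] == "03040103"
print(mask)

# Step 2: apply the mask to keep only the records where it is True
selected = demo[mask]
print(selected.size)    # 3 of the 4 records match
print(int(mask.sum()))  # counting the True values gives the same number

# Conditions can be combined with & (and) or | (or);
# each condition needs its own parentheses
big = demo[(demo["HUC_8"] == "03040103") & (demo["ACRES"] > 14000)]
print(big["ACRES"])
```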
In [ ]:
#Show the first 10 rows
arrSelectedHUC8[:10]
In [ ]:
#Calculate the mean area of these HUCs
arrSelectedHUC8['ACRES'].mean()
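The same mask-then-aggregate pattern can be repeated for every HUC_8 code rather than just one. The following is one possible sketch, not something the notebook does: it uses `np.unique` to find the distinct codes in a made-up array and computes the mean area for each. The names `demo` and `means` are hypothetical.

```python
import numpy as np

# Made-up records, for illustration only
demo = np.array(
    [
        ("03040103", 12000.0),
        ("03040103", 18000.0),
        ("03040104", 30000.0),
        ("03040103", 15000.0),
    ],
    dtype=[("HUC_8", "U8"), ("ACRES", "<f8")],
)

means = {}
for code in np.unique(demo["HUC_8"]):
    group = demo[demo["HUC_8"] == code]           # mask and subset in one step
    means[str(code)] = float(group["ACRES"].mean())
    print(code, group.size, means[str(code)])
```

A loop like this is fine for a handful of groups; for large tables, a dedicated grouping tool would likely be a better fit.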
In [ ]:
#Set up inline plotting for a histogram of HUC_12 areas
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn
seaborn.set() # set plot style
In [ ]:
plt.hist(arrHUCS['ACRES'])
plt.title('Area Distribution of HUC_12s')
plt.xlabel('Area (acres)')
plt.ylabel('Number of HUC_12s');
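If you want the numbers behind a histogram rather than the picture, NumPy's `np.histogram` returns the bin counts and bin edges directly. The sketch below uses a short made-up list of areas and three bins so the result is easy to verify by hand; with the real data you would pass `arrHUCS['ACRES']` instead.

```python
import numpy as np

# Made-up areas in acres, for illustration only
acres = np.array([12000.0, 18000.0, 30000.0, 15000.0, 22000.0, 9000.0])

# Three equal-width bins spanning the minimum to the maximum value
counts, edges = np.histogram(acres, bins=3)

# Report each bin's range alongside its count
for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print("%.0f to %.0f acres: %d" % (lo, hi, n))
```

There is always one more edge than there are bins, and the counts add up to the number of values passed in.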