Python GeoPandas appetizer
Introduction to GIS scripting
May, 2017© 2017, Stijn Van Hoey (mailto:stijnvanhoey@gmail.com). Licensed under CC BY 4.0 Creative Commons
In [4]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
In this notebook, we will explore some of the functionalities of GeoPandas as a first introduction to how we can build on top of the Python standard library and the scientific Python ecosystem.
Some new functionalities are required for this showcase (but will be discussed later in more detail):
In [5]:
import mplleaflet
import geopandas as gpd
Similar to reading in csv files with Pandas, geoPandas provides reading methods/functions to read GIS-formats from file:
In [6]:
deelbekkens = gpd.read_file("../data/deelbekkens/Deelbekken.shp")
In [7]:
deelbekkens.head()
Out[7]:
We can check the geometry information in more detail, to see if we are dealing with POINT, LINE or POLYGON data:
In [8]:
deelbekkens.geometry.head()
Out[8]:
In [9]:
deelbekkens.plot()
Out[9]:
Selecting the Nete from this shapefile (any BEKNAAM that contains Nete as name), just as we would do in Pandas:
In [10]:
nete = deelbekkens[deelbekkens["BEKNAAM"].str.lower().str.contains('nete')].copy()
# Remark: you can off course also use: deelbekkens[deelbekkens["BEKNAAM"] == "Netebekken"]
nete.plot()
Out[10]:
Getting the current projection information
In [11]:
nete.crs
Out[11]:
Converting to WGS84, which is represented by the EPSG code 4326 enables us easy combination with open street map backgrounds:
In [12]:
nete_wgs84 = nete.to_crs({'init': 'epsg:4326'}) # method that converts the projection to the given EPSG code
In [13]:
nete_wgs84.crs
Out[13]:
We want to plot the data on an interactive map. For small datasets, a convenient Package, called mplleaflet exists:
In [14]:
fig, ax = plt.subplots()
nete_wgs84.plot(ax=ax)
nete_wgs84.centroid.plot(ax=ax, markersize=10) # Add the centroids of the individual polygons to the plot
mplleaflet.display()
Out[14]:
Some derived spatial attributes are directly available to the object, such as the area of each polygon:
In [15]:
nete_wgs84.area
Out[15]:
Adding a buffer to our polygons is a method available to the geometry of the geoDataFrame:
Add a buffered version of the geometry as an additional column to the data:
In [16]:
nete["buffered_1000"] = nete.geometry.buffer(1000)
In [17]:
nete.head()
Out[17]:
In [18]:
# Plot both of the polygons together:
fig, ax = plt.subplots()
nete.plot(ax=ax)
nete.set_geometry("buffered_1000").plot(ax=ax)
ax.set_aspect('equal', 'datalim')
Let's introduce a second vector file, saved as a Geojson file (filename.geojson) with the EU defined 10K grid cells:
In [19]:
eu_10k_grids = gpd.read_file("../data/EUgrid10.geojson")
eu_10k_grids.crs # the projection is contained in the geojson file
Out[19]:
In [20]:
eu_10k_grids.plot()
Out[20]:
In [21]:
eu_10k_grids.head(5)
Out[21]:
We can make selections (cfr. queries) to our data on some kind of condition:
In [22]:
eu_10k_grids[eu_10k_grids["CellCode"].str.contains('N311')].plot()
mplleaflet.display()
Out[22]:
Using a spatial join, extract those 10K gridcells that intersect with the polygon of the Grote Nete study area (we use the original nete variable):
In [23]:
nete_10k = gpd.sjoin(eu_10k_grids, nete_wgs84, how="inner", op='intersects')
A plot to control the outcome:
In [24]:
fig, ax = plt.subplots()
nete_10k.plot(ax=ax)
nete_wgs84.plot(ax=ax)
mplleaflet.display()
Out[24]:
In [25]:
nete_10k.crs
Out[25]:
Save the result of our analysis as a shapefile:
In [55]:
nete_10k.to_file("../nete_10k_grid.shp") # But maybe a geojson is easier to manage (as it is one file...)
Nice, but.... how?