Python GeoPandas appetizer

Introduction to GIS scripting
May, 2017

© 2017, Stijn Van Hoey (mailto:stijnvanhoey@gmail.com). Licensed under CC BY 4.0 Creative Commons


In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In this notebook, we will explore some of the functionalities of GeoPandas as a first introduction to how we can build on top of the Python standard library and the scientific Python ecosystem.

Some new functionalities are required for this showcase (but will be discussed later in more detail):


In [ ]:
import mplleaflet
import geopandas as gpd

Similar to reading in csv files with Pandas, geoPandas provides reading methods/functions to read GIS-formats from file:


In [ ]:
deelbekkens = gpd.read_file("../data/deelbekkens/Deelbekken.shp")

In [ ]:
deelbekkens.head()
REMEMBER:
  • It is like a DataFrame, but with a Geometry column attached to it!

We can check the geometry information in more detail, to see if we are dealing with POINT, LINE or POLYGON data:


In [ ]:
deelbekkens.geometry.head()

In [ ]:
deelbekkens.plot()

Selecting the Nete from this shapefile (any BEKNAAM that contains Nete as name), just as we would do in Pandas:


In [ ]:
nete = deelbekkens[deelbekkens["BEKNAAM"].str.lower().str.contains('nete')].copy()
# Remark: you can off course also use: deelbekkens[deelbekkens["BEKNAAM"] == "Netebekken"]
nete.plot()
REMEMBER:
  • The usage of boolean indexing is just the same as we did it with Numpy/Pandas...

Getting the current projection information


In [ ]:
nete.crs
ATTENTION:
  • As we are dealing with spatial information, the **projection information** is crucial!

Converting to WGS84, which is represented by the EPSG code 4326 enables us easy combination with open street map backgrounds:


In [ ]:
nete_wgs84 = nete.to_crs({'init': 'epsg:4326'}) # method that converts the projection to the given EPSG code

In [ ]:
nete_wgs84.crs

We want to plot the data on an interactive map. For small datasets, a convenient Package, called mplleaflet exists:


In [ ]:
fig, ax = plt.subplots()
nete_wgs84.plot(ax=ax)
nete_wgs84.centroid.plot(ax=ax, markersize=10)  # Add the centroids of the individual polygons to the plot
mplleaflet.display()

Some derived spatial attributes are directly available to the object, such as the area of each polygon:


In [ ]:
nete_wgs84.area

Adding a buffer to our polygons is a method available to the geometry of the geoDataFrame:

Add a buffered version of the geometry as an additional column to the data:


In [ ]:
nete["buffered_1000"] = nete.geometry.buffer(1000)

In [ ]:
nete.head()

In [ ]:
# Plot both of the polygons together:
fig, ax = plt.subplots()
nete.plot(ax=ax)
nete.set_geometry("buffered_1000").plot(ax=ax)
ax.set_aspect('equal', 'datalim')

Let's introduce a second vector file, saved as a Geojson file (filename.geojson) with the EU defined 10K grid cells:


In [ ]:
eu_10k_grids = gpd.read_file("../data/EUgrid10.geojson")
eu_10k_grids.crs # the projection is contained in the geojson file

In [ ]:
eu_10k_grids.plot()

In [ ]:
eu_10k_grids.head(5)

We can make selections (cfr. queries) to our data on some kind of condition:


In [ ]:
eu_10k_grids[eu_10k_grids["CellCode"].str.contains('N311')].plot()
mplleaflet.display()

Using a spatial join, extract those 10K gridcells that intersect with the polygon of the Grote Nete study area (we use the original nete variable):


In [ ]:
nete_10k = gpd.sjoin(eu_10k_grids, nete_wgs84, how="inner", op='intersects')

A plot to control the outcome:


In [ ]:
fig, ax = plt.subplots()
nete_10k.plot(ax=ax)
nete_wgs84.plot(ax=ax)
mplleaflet.display()

In [ ]:
nete_10k.crs

Save the result of our analysis as a shapefile:


In [ ]:
nete_10k.to_file("../nete_10k_grid.shp")  # But maybe a geojson is easier to manage (as it is one file...)

Nice, but.... how?