Geopandas is a Python module that extends the functionalities of Pandas data analysis library (that takes advantage of numpy module). Geopandas makes possible to work with spatial data such as Shapefiles. It is also possible to work with rasters using geopandas with rasterio module (see example).
Spatial data can be read easily with geopandas using gpd.from_file() function:
In [1]:
# Import necessary modules
import geopandas as gpd # geopandas is usually shortened as gpd
# Set filepath
fp = r"C:\HY-Data\HENTENKA\Data\DAMSELFISH_distributions.shp"
# Read file using gpd.read_file()
data = gpd.read_file(fp)
# Let's see what we got --> print the 'head' of the file (first 5 rows)
data.head()
Out[1]:
We can select for example 50 rows of the input data and write those into a new Shapefile by first selecting the data using pandas functionalities and then writing the selection with gpd.to_file() function:
In [2]:
# Create a output path for the data
out = r"C:\HY-Data\HENTENKA\Data\DAMSELFISH_distributions_SELECTION.shp"
# Select first 50 rows
selection = data[0:50]
# Print the head of our selection
print(selection.head())
# Write those rows into a new Shapefile
selection.to_file(out, driver="ESRI Shapefile") # drivers makes it possible to write to different file formats
Geopandas takes advantage of Shapely's geometric objects. Geometries are stored in a column called geometry that is a default column name for storing geometric information in geopandas.
In [3]:
# Print first 5 rows of the column 'geometry'
# -------------------------------------------
# It is possible to use only specific columns by specifying the column name within square brackets []
data['geometry'].head()
Out[3]:
Since spatial data is stored as Shapely objects, it is possible to use all of the functionalities of Shapely module that we practiced earlier:
In [4]:
# Print the areas of the first 5 polygons
# ---------------------------------------
# Make a selection that contains only the first five rows
selection = data[0:5]
# Iterate over the selected rows by using a for loop in a pandas function called '.iterrows()'
for row_num, row in selection.iterrows():
# Calculate the area of the polygon
poly_area = row['geometry'].area
# Print information for the user
print("Polygon area at index %s is: %s" % (row_num, poly_area))
# Outputs:
Since geopandas takes advantage of Shapely geometric objects it is possible to create a Shapefile from a scratch or using e.g. a text file that contains coordinates.
Let's create an empty GeoDataFrame that is a Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes --> i.e. it is a similar thing as attribute table that also contains geometries. GeoDataFrame extends the functionalities of pandas.DataFrame() in a way that it is possible to use spatial data within pandas (hence the name geopandas).
In [2]:
# Import necessary modules first
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point, Polygon
import fiona
# Create an empty geopandas GeoDataFrame
data = gpd.GeoDataFrame()
# Let's see what's inside
print(data)
# Outputs:
The GeoDataFrame is empty since we haven't places any data inside. Let's create a new column called geometry that will contain our Shapely Geometric Object:
In [3]:
# Create a new column called 'geometry' to the GeoDataFrame
data['geometry'] = None
# Let's see what's inside
data
# Outputs:
Out[3]:
Now we have a geometry column in the GeoDataFrame but we don't have any data yet. Let's create a Shapely Polygon that we can insert to our GeoDataFrame:
In [4]:
# Create a Shapely Polygon that has coordinates in WGS84 projection (i.e. lat, lon)
# Coordinates are corner coordinates of Senaatintori in Helsinki
coordinates = [(24.950899, 60.169158), (24.953492, 60.169158), (24.953510, 60.170104), (24.950958, 60.169990)]
# Create a Shapely polygon from the coordinate-tuple list
poly = Polygon(coordinates)
# Let's see that we really have a polygon
print(poly)
# Outputs:
Now we need to insert that polygon into the column 'geometry' of our GeoDataFrame:
In [5]:
# Insert the polygon to the GeoDataFrame
data['geometry'] = [poly]
# Print out the data
data
# Outputs:
Out[5]:
Now we have a GeoDataFrame with Polygon that we can export to a Shapefile. Let's add another column to our GeoDataFrame called Location with text Senaatintori.
In [6]:
# Add a new column and insert data
data['Location'] = ['Senaatintori']
# Print out the data
data
# Outputs
Out[6]:
Before exporting the data it is useful to determine the projection for the GeoDataFrame.
GeoDataFrame has a property called .crs that shows the coordinatesystem of the data which is empty in our case since we are creating the data from the scratch. If you import a Shapefile into geopandas that has a defined projection, the crs of the GeoDataFrame will automatically be the one of that Shapefile.
A Python module called fiona has a nice function called from_epsg for passing coordinate system for the GeoDataFrame. Next we will use that and determine the projection to WGS84 (epsg code: 4326):
In [7]:
# Check the current coordinate system
print(data.crs)
# Outputs:
In [8]:
# Import specific function 'from_epsg' from fiona module
from fiona.crs import from_epsg
# Set the GeoDataFrame's coordinate system to WGS84
data.crs = from_epsg(4326)
# Let's see how the crs definition looks like
data.crs
Out[8]:
Finally, we can export the data using GeoDataFrames .to_file function.
In [9]:
# Determine the output path for the Shapefile
out = r"C:\HY-Data\HENTENKA\Data\Senaatintori.shp"
# Write the data into that location
data.to_file(out)
Now we have successfully created a Shapefile from the scratch using only Python programming. Similar approach can be used to for example to read coordinates from a text file (e.g. points) and create Shapefiles from those automatically.
Following lines of code demonstrate how we can create an interactive map of our Polygon geometry on top of Google Maps using bokeh module. See the result at the end of this page, you can zoom in/out and also move around the map.
In the GIS-lab computers bokeh module is not installed, so unfortunately at the moment it is not possible for you to test this yourself.
In [10]:
# Import bokeh stuff to visualize our map
from bokeh.models.glyphs import Patch
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import GMapPlot, Range1d, ColumnDataSource, PanTool, WheelZoomTool, BoxSelectTool, GMapOptions, ResizeTool
# Plot the data using bokeh - Initialize
output_notebook()
# Parse coordinates from the GeoDataFrames first row [i.e. index --> 0]
x_coords, y_coords = x, y = data['geometry'].values[0].exterior.coords.xy
# Set the ranges for the map ("how far the user can go")
x_range = Range1d(24.9, 25.0)
y_range = Range1d(60.15, 60.2)
# Set the Google Maps to Helsinki
map_options = GMapOptions(lat=60.17, lng=24.952, map_type="roadmap", zoom=15)
# Create an interactive Google map to Helsinki
plot = GMapPlot(
x_range=x_range, y_range=y_range,
map_options=map_options,
title = "Helsinki - Senaatintori",
plot_width=500,
plot_height=500
)
# Create a Data source
source = ColumnDataSource(
data=dict(
lat=y_coords,
lon=x_coords
)
)
# Create the Patch i.e. the polygon
poly_patch = Patch(x="lon",y="lat", fill_color='#DE2D26', fill_alpha=0.8)
# Add Polygon Patch on top of the map
plot.add_glyph(source, poly_patch)
# Add tools
plot.add_tools(PanTool(), WheelZoomTool(), ResizeTool(), BoxSelectTool())
# Show the map
show(plot)
In [11]:
x_coords
Out[11]: