Geocoding, i.e. converting addresses into coordinates or vice versa, is a very common GIS task. Luckily, Python has good libraries that make geocoding easy. One of them is geopy, which makes it easy to locate the coordinates of addresses, cities, countries, and landmarks across the globe using third-party geocoders and other data sources.
As said, geopy uses third-party geocoders, i.e. services that do the geocoding, to locate addresses, and it works with multiple service providers such as the Google Geocoding API, OpenStreetMap Nominatim, Bing Maps, and ESRI ArcGIS.
Thus, there are plenty of geocoders to choose from! However, to use these services you might need to request so-called API access keys from the service provider. You can get access keys for e.g. the Google Geocoding API from the Google APIs console by creating a Project and enabling that API from the Library. Read a short introduction about using the Google API Console from here.
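Once you have an access key, it is usually safer to keep it out of the script itself. The following is a minimal sketch of reading the key from an environment variable; the helper name `load_api_key` and the variable name `GEOCODING_API_KEY` are made up for this illustration and are not part of geopy or Geopandas.

```python
import os


def load_api_key(env_var="GEOCODING_API_KEY"):
    """Return the API key stored in an environment variable.

    Both the function name and the default variable name are hypothetical.
    Raises RuntimeError with a readable message if the variable is unset or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API key first."
        )
    return key
```

The returned string can then be passed wherever a geocoding call expects a key, so the key never ends up in a shared notebook or repository.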
There are also other Python modules in addition to geopy that can do geocoding such as Geocoder.
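For a single address, geopy can also be called directly. Geopy geocoder objects expose a `geocode(query)` method that returns a location with `address`, `latitude` and `longitude` attributes, or `None` when nothing is found. The sketch below wraps that call in a hypothetical helper, `address_to_lonlat`, and uses an offline stand-in (`FakeGeocoder`, also made up here) so that it runs without network access or an API key; the coordinates in the stand-in are taken from the geocoding output shown later on this page.

```python
from collections import namedtuple

# Stand-in for the location object that geopy returns.
FakeLocation = namedtuple("FakeLocation", ["address", "latitude", "longitude"])


class FakeGeocoder:
    """Offline stand-in that mimics the .geocode(query) method of geopy geocoders."""

    _known = {
        "Kampinkuja 1, 00100 Helsinki, Finland": (60.1683731, 24.9301701),
    }

    def geocode(self, query):
        coords = self._known.get(query)
        if coords is None:
            return None  # geopy also returns None when there is no match
        return FakeLocation(query, coords[0], coords[1])


def address_to_lonlat(address, geocoder):
    """Return (longitude, latitude) for an address, or None if it is not found."""
    location = geocoder.geocode(address)
    if location is None:
        return None
    return (location.longitude, location.latitude)


# With geopy installed and network access, a real geocoder can be passed instead:
#   from geopy.geocoders import Nominatim
#   geocoder = Nominatim(user_agent="my-geocoding-example")
print(address_to_lonlat("Kampinkuja 1, 00100 Helsinki, Finland", FakeGeocoder()))
```

The helper returns the coordinates in (x, y) order, i.e. longitude first, which is the order Shapely Point objects use.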
It is also possible to do geocoding in Geopandas, which integrates geopy's functionality. Geopandas has a function called `geocode()` that can geocode a list of addresses (strings) and return a GeoDataFrame containing the resulting point objects in the `geometry` column. Nice, isn't it! Let's try this out.
Download a text file called addresses.txt that contains a few addresses around the Helsinki Region. The first rows of the data look like the following:
```
id;address
1000;Itämerenkatu 14, 00101 Helsinki, Finland
1001;Kampinkuja 1, 00100 Helsinki, Finland
1002;Kaivokatu 8, 00101 Helsinki, Finland
1003;Hermanstads strandsväg 1, 00580 Helsingfors, Finland
```
```python
# Import necessary modules
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Import the geocoding function
from geopandas.tools import geocode

# Filepath
fp = r"/home/geo/addresses.txt"

# Read the data (the file is semicolon-separated)
data = pd.read_csv(fp, sep=';')

# Let's take a look at the data
print(data.head())
```
```
     id                                            address
0  1000           Itämerenkatu 14, 00101 Helsinki, Finland
1  1001              Kampinkuja 1, 00100 Helsinki, Finland
2  1002               Kaivokatu 8, 00101 Helsinki, Finland
3  1003  Hermanstads strandsväg 1, 00580 Helsingfors, F...
4  1004                  Itäväylä, 00900 Helsinki, Finland
```
Now we have our data in a Pandas DataFrame and we can geocode our addresses:
```python
from geopandas.tools import geocode

# Key for our Google Geocoding API
# Notice: only the cloud computers of our course can access and successfully execute the following
key = 'AIzaSyAwNVHAtkbKlPs-EEs3OYqbnxzaYfDF2_8'

# Geocode addresses
# The provider is named explicitly because the default provider of geocode()
# has changed between Geopandas versions
geo = geocode(data['address'], provider='googlev3', api_key=key)

print(geo.head(2))
```
```
                                    address                       geometry
0  Itämerenkatu 14, 00180 Helsinki, Finland  POINT (24.9146767 60.1628658)
1     Kampinkuja 1, 00100 Helsinki, Finland  POINT (24.9301701 60.1683731)
```
And voilà! As a result we have a GeoDataFrame that contains the address and a `geometry` column containing Shapely Point objects that we can use, for example, for exporting the addresses to a Shapefile. However, the `id` column is not there. Thus, we need to join the information from `data` into our new GeoDataFrame `geo`, i.e. make a Table Join.
Table joins are again something that you need really frequently when doing GIS analyses. Combining data from different tables based on a common key attribute can be done easily in Pandas/Geopandas using the `.merge()` function. Let's join the `geo` and `data` DataFrames together based on the common column `address`. The parameter `on` is used to determine the common key in the tables. If the key in the first table is named differently than in the other one, you can also specify the keys separately for each table by using the `left_on` and `right_on` parameters.
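When the key column has a different name in each table, pandas `merge()` takes the two names through its `left_on` and `right_on` parameters. Here is a small sketch with plain pandas DataFrames standing in for the real tables; the column name `addr` and the variable names are invented for the illustration, while the addresses and longitudes are copied from the outputs on this page.

```python
import pandas as pd

# Left table: geocoded addresses, keyed by the column 'address'
geocoded = pd.DataFrame({
    "address": [
        "Kampinkuja 1, 00100 Helsinki, Finland",
        "Kaivokatu 8, 00101 Helsinki, Finland",
    ],
    "lon": [24.9301701, 24.9418933],
})

# Right table: the same addresses, but the key column is named 'addr' (hypothetical)
ids = pd.DataFrame({
    "id": [1001, 1002],
    "addr": [
        "Kampinkuja 1, 00100 Helsinki, Finland",
        "Kaivokatu 8, 00101 Helsinki, Finland",
    ],
})

# Name the key column of each table separately
joined = geocoded.merge(ids, left_on="address", right_on="addr")

print(joined[["id", "address", "lon"]])
```

Note that both key columns (`address` and `addr`) are kept in the result, so one of them can be dropped afterwards if it is not needed.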
```python
# Join tables by using a key column 'address'
join = geo.merge(data, on='address')

# Let's see what we have
print(join.head())

# Let's also check the data type
type(join)
```
```
                                             address  \
0              Kampinkuja 1, 00100 Helsinki, Finland
1               Kaivokatu 8, 00101 Helsinki, Finland
2  Hermanstads strandsväg 1, 00580 Helsingfors, F...
3                  Itäväylä, 00900 Helsinki, Finland
4         Tyynenmerenkatu 9, 00220 Helsinki, Finland

                               geometry    id
0         POINT (24.9301701 60.1683731)  1001
1         POINT (24.9418933 60.1698665)  1002
2  POINT (24.9774004 60.18735880000001)  1003
3  POINT (25.0919641 60.21448089999999)  1004
4         POINT (24.9214846 60.1565781)  1005

Out: geopandas.geodataframe.GeoDataFrame
```
As a result we have a new GeoDataFrame called `join` that contains all the original columns plus a new column for the geometry.
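One caveat with using the address text as the join key: the geocoder may return a slightly different address than the one that was sent. In the outputs on this page, the first address went in with postal code 00101 and came back with 00180, and the row with id 1000 is missing from the joined table. The sketch below reproduces this with plain pandas DataFrames as stand-ins (`data_demo` and `geo_demo` are made-up names) and shows one possible workaround, joining on the row index. It assumes that the geocoded table keeps the same row index as the input, which is worth checking for the Geopandas version you use.

```python
import pandas as pd

# Stand-in for the original table
data_demo = pd.DataFrame({
    "id": [1000, 1001],
    "address": [
        "Itämerenkatu 14, 00101 Helsinki, Finland",
        "Kampinkuja 1, 00100 Helsinki, Finland",
    ],
})

# Stand-in for the geocoder output: the first address has a different postal code
geo_demo = pd.DataFrame({
    "address": [
        "Itämerenkatu 14, 00180 Helsinki, Finland",
        "Kampinkuja 1, 00100 Helsinki, Finland",
    ],
    "lon": [24.9146767, 24.9301701],
    "lat": [60.1628658, 60.1683731],
})

# An exact-text merge keeps only the rows whose address strings are identical
by_address = geo_demo.merge(data_demo, on="address")
print(len(by_address), "of", len(data_demo), "rows matched on address")

# Joining on the shared row index keeps every row
by_index = geo_demo.join(data_demo[["id"]])
print(by_index[["id", "address"]])
```

Comparing `len(join)` with `len(data)` after a merge is a quick way to notice whether any rows were lost.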
Now it is easy to save the address points into a Shapefile:

```python
# Output file path
outfp = r"/home/geo/addresses.shp"

# Save to Shapefile
join.to_file(outfp)
```
That's it. Now we have successfully geocoded those addresses into Points and made a Shapefile out of them.
Task: Make a map out of the points. What do you think the addresses represent?