Juan Shishido
School of Information
GSR, D-Lab
In [1]:
import json
import requests
import pandas as pd
from pprint import pprint
A function with options for both DSTK and Photon.
This is just for demonstration purposes. In most situations, you'll probably not want to combine both the DSTK and Photon APIs into a single function. Of course, it's based on preference, so you might, in fact, want to do that. Just know that Photon provides more information (even multiple results, in some cases) than DSTK.
In [2]:
def single_address(address, api='dstk'):
    '''
    Individual address lookup with
    either DSTK or Photon

    Default is DSTK's /street2coordinates
    For DSTK's Google-style: 'google'
    For Photon: 'photon'

    Address must be a string
    '''
    # API check
    assert api in ('dstk', 'google', 'photon')
    # Type check
    assert isinstance(address, str)
    # /street2coordinates
    dstk_dstk = 'http://www.datasciencetoolkit.org/street2coordinates/'
    # Google-style
    dstk_google = 'http://www.datasciencetoolkit.org/maps/api/geocode/json?sensor=false&address='
    # Photon
    photon = 'http://photon.komoot.de/api/?q='
    # API: pick the URL prefix that matches the requested service
    if api == 'dstk':
        url_prefix = dstk_dstk
    elif api == 'google':
        url_prefix = dstk_google
    elif api == 'photon':
        url_prefix = photon
    # URL: spaces in the address become '+'
    url = url_prefix + address.replace(' ', '+')
    # Response: parse the JSON body into Python objects
    response = requests.get(url)
    return json.loads(response.text)
In [3]:
google_hq = single_address('1600 Amphitheatre Pkwy, Mountain View, CA')
pprint(google_hq)
In [4]:
google = single_address('1600 Amphitheatre Pkwy, Mountain View, CA', 'google')
pprint(google)
DSTK provides a Google-style option to make it easier for people already using Google's geocoding API. Simply replace maps.googleapis.com with www.datasciencetoolkit.org in the request URL.
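As a minimal sketch of that host swap, the snippet below rewrites a Google-style geocoding URL so that it points at DSTK. The helper name `to_dstk` is hypothetical, and the switch to `http` is an assumption made only to match the DSTK URLs used in this notebook.

```python
from urllib.parse import urlsplit, urlunsplit

DSTK_HOST = 'www.datasciencetoolkit.org'

def to_dstk(google_url):
    '''Swap the host of a Google-style geocoding URL for the DSTK host.'''
    parts = urlsplit(google_url)
    # Keep the path and query string; change only the scheme and host
    return urlunsplit(parts._replace(scheme='http', netloc=DSTK_HOST))

example = to_dstk(
    'https://maps.googleapis.com/maps/api/geocode/json'
    '?sensor=false&address=1600+Amphitheatre+Pkwy'
)
print(example)
```

Only the scheme and host change, so any existing code that builds the path and query string can stay as it is.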
In [5]:
photon_hq = single_address('1600 Amphitheatre Pkwy, Mountain View, CA', 'photon')
pprint(photon_hq)
Photon provides much more data than DSTK. It can also return multiple entries. In those cases, you'll need to parse through the JSON to get what you need.
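One way to do that parsing is sketched below. It assumes the response is a GeoJSON FeatureCollection whose features carry `[longitude, latitude]` coordinates and a `properties` dictionary. The helper name `photon_points` is hypothetical, and `sample` is a hand-written placeholder in that shape, not real API output.

```python
def photon_points(result):
    '''Return (name, longitude, latitude) tuples from a Photon-style response.'''
    points = []
    for feature in result.get('features', []):
        # GeoJSON stores coordinates as [longitude, latitude]
        lon, lat = feature['geometry']['coordinates'][:2]
        name = feature.get('properties', {}).get('name')
        points.append((name, lon, lat))
    return points

# Placeholder data for illustration only
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature',
         'geometry': {'type': 'Point', 'coordinates': [-122.08, 37.42]},
         'properties': {'name': 'Example A'}},
        {'type': 'Feature',
         'geometry': {'type': 'Point', 'coordinates': [-122.09, 37.43]},
         'properties': {}},
    ],
}

for point in photon_points(sample):
    print(point)
```

Entries without a `name` property come back with `None` in the first position, so you can filter them out or fall back to another property.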
The standard IPython kernel allows running code in other languages using the %%magic syntax.
You can use cURL to access the DSTK API and even save the output to a file. First, invoke the %%bash cell magic. It is only active in the cell in which it's called.
With the code below, you're submitting a POST request to the DSTK server. cURL prints the response along with its progress meter, a small table of transfer sizes, speeds, and times.
In [6]:
%%bash
curl -d "1600 Amphitheatre Pkwy, Mountain View, CA" \
http://www.datasciencetoolkit.org/street2coordinates
Note: The backslash in the command is simply to allow us to type the command across multiple lines.
The next command has three main components (listed from back to front):
http://www.datasciencetoolkit.org/street2coordinates
-d @data/bartaddresses.txt
-o data/bartcoordinates.json
The first of these tells cURL the location of the API.
The next one relates to the data to be sent to the API. This uses the -d flag. Use the @ symbol to indicate that the addresses should be read from a file.
The -o flag and the argument that follows it tell cURL to save the output to a file named bartcoordinates.json.
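If you'd rather stay in Python, a cURL call with these three components can be roughly sketched with the standard library. The function names `build_request` and `geocode_to_file` are hypothetical. Stripping carriage returns and newlines mirrors what cURL does when -d reads from a file. This sketch has not been checked against the live server.

```python
import urllib.request

DSTK_URL = 'http://www.datasciencetoolkit.org/street2coordinates'

def build_request(addresses_text, url=DSTK_URL):
    '''Build (but do not send) a POST request carrying the address text.'''
    # Like curl -d @file, drop carriage returns and newlines from the body
    body = addresses_text.replace('\r', '').replace('\n', '').encode('utf-8')
    return urllib.request.Request(url, data=body, method='POST')

def geocode_to_file(in_path, out_path, url=DSTK_URL):
    '''Send the contents of in_path and save the raw response to out_path.'''
    with open(in_path, encoding='utf-8') as f:
        request = build_request(f.read(), url)
    with urllib.request.urlopen(request) as response, open(out_path, 'wb') as out:
        out.write(response.read())

# Inspect the request without touching the network
request = build_request('1600 Amphitheatre Pkwy, Mountain View, CA\n')
print(request.get_method(), request.full_url)
```

Calling `geocode_to_file('data/bartaddresses.txt', 'data/bartcoordinates.json')` would then play the role of the -d and -o arguments.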
In [7]:
%%bash
curl -o data/bartcoordinates.json -d @data/bartaddresses.txt \
http://www.datasciencetoolkit.org/street2coordinates
Look in ./data to find your geocoded addresses.
In [8]:
json_data = pd.read_json('data/bartcoordinates.json').T
In [9]:
stations = pd.read_csv('data/bartstations.csv')
In [10]:
json_data = json_data.reset_index()
json_data = json_data.rename(columns = {'index':'address'})
In [11]:
json_data['address'] = json_data['address'].str.lower()
In [12]:
stations['address'] = stations['address'].str.lower()
First, create a key field in json_data, using its index. Then, merge json_data and stations on address.
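To see why the addresses are lowercased before merging, here is a plain-Python sketch of the same idea: an inner join on a case-insensitive address key. It uses lists of dictionaries rather than pandas. The function name and the rows are made up for illustration.

```python
def inner_join_on_address(geocoded, stations):
    '''Inner join two lists of dicts on a case-insensitive 'address' key.'''
    lookup = {row['address'].lower(): row for row in stations}
    joined = []
    for row in geocoded:
        key = row['address'].lower()
        # Inner join: keep only addresses present in both inputs
        if key in lookup:
            joined.append({**lookup[key], **row, 'address': key})
    return joined

# Made-up rows for illustration only
geocoded = [
    {'address': '123 Example St, Sometown, CA', 'latitude': 1.0, 'longitude': 2.0},
    {'address': '9 Unmatched Ave', 'latitude': 3.0, 'longitude': 4.0},
]
stations = [
    {'address': '123 EXAMPLE ST, SOMETOWN, CA', 'station_name': 'Example Station'},
]

result = inner_join_on_address(geocoded, stations)
print(result)
```

Without the lowercasing step, the two spellings of the first address would not match and the join would come back empty.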
In [13]:
bart = json_data.merge(stations, on = 'address', how='inner')
In [14]:
geo_data = {
    'type': 'FeatureCollection',
    'features': []
}

# Build one GeoJSON Point feature per station
for i in bart.index:
    feature = {
        'type': 'Feature',
        'geometry': {
            'type': 'Point',
            # GeoJSON order is [longitude, latitude]
            'coordinates': [float(bart['longitude'][i]), float(bart['latitude'][i])]
        },
        'properties': {
            'station_name': bart['station_name'][i]
        }
    }
    # Add the feature into the GeoJSON wrapper
    geo_data['features'].append(feature)

# Text mode: json.dump writes strings, not bytes
with open('map/geojson/bart_coords.geojson', 'w') as f:
    json.dump(geo_data, f, indent=2)
In [15]:
with open('map/geojson/bart_coords.geojson', 'r') as infile:
    lines = infile.readlines()

with open('map/geojson/bart_coords.js', 'w') as outfile:
    # Prefix the GeoJSON with a JavaScript variable assignment
    outfile.write('var bart = ')
    outfile.writelines(lines)
# The with blocks close both files, so no explicit close() is needed