Comparing journey times returned by Mapzen's OSRM and Valhalla instances, and the Google Maps Directions API

If you'd like to run this analysis locally, install the packages from requirements.txt

Commit 2a64889 has the old 'dirty' version of this notebook, including route retrieval functionality


In [1]:
import math
from lxml import etree
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from sklearn import linear_model
import matplotlib as mpl
from matplotlib.collections import PatchCollection
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from mpl_toolkits.mplot3d import Axes3D
import json
from itertools import chain
import requests
from shapely.geometry import (
    mapping, Point, Polygon,
    LineString, MultiLineString, MultiPolygon, box)

from geojson import (
    Feature, FeatureCollection,
    LineString as gj_ls,
    MultiLineString as gj_ml,
    dumps as gj_dumps)

from shapely.ops import unary_union
import fiona
from descartes import PolygonPatch

import statsmodels
import statsmodels.formula.api as sm

import ipywidgets as widgets
from ipywidgets import interact
from IPython.display import display

from helpers import (
    decode_polyline,
    project_linestring,
    query_route_gmaps,
    query_route_osrm,
    query_route_valhalla,
    query_elevation,
    similarity,
    segmentise)

from IPython.core.display import HTML

%matplotlib inline
mpl.rcParams['figure.figsize'] = (16, 12)

Skip the next cell if you don't want font customisation


In [2]:
from matplotlib import rc
rc('font', **{'family':'sans-serif',
    'sans-serif':['Helvetica'],
    'monospace': ['Inconsolata'],
    'serif': ['Adobe Garamamond Pro']})

Skip the next cell if you don't have LaTeX installed


In [3]:
rc('text', **{'usetex': True})
rc('text', **{'latex.preamble': '\usepackage{sfmath}'})

The journey times and routes were retrieved by calling Mapzen's OSRM and Valhalla instances, respectively, and by calling the Google Maps Directions API.

helpers.py includes functions and examples for retrieving the above data into a Pandas DataFrame

Journeys were calculated from the approximate London Bike Share network centroid, on Westminster Bridge, to each station in the network.
Calls to OSRM used viaroute=bicycle, calls to Valhalla used {costing: bicycle}
No costing options were set for Valhalla or Google Maps.

The journey times were retrieved from ['route_summary']['total_time'] for OSRM
The journey times were retrieved from ['trip']['summary']['time'] for Valhalla
The journey times were retrieved by summing the duration value for each step in each leg from route[0] for Google Maps.


In [129]:
routes = pd.read_csv("complete.csv", index_col=0)

Journey origin is:

-0.12203999999842599, 51.500829999995766 (lon, lat)
27054.81221603953, 25108.754540393566 (projected coordinates)

centroid_distance is calculated using pythagoras' theorem
travel_time is calculated in minutes (returned journey time / 60)


In [130]:
routes.head()


Out[130]:
gmaps_route name osrm_route travel_time_gmaps travel_time_osrm travel_time_valhalla valhalla_route lat lon station_elevation
0 ewiyHxyVsMmA{JmAmMuBUHq@lAGZtBxAGR@BFFCJgBsA{B... River Street , Clerkenwell osjfaBnhmFu@xm@p@zOHvDKhUAzCE`EHzSMjJuDD}FGiFi... 20.666667 18.566667 17.133333 isjfaBljmFfCi`CNqMNc_@sAsPFuHDcJFsG\}l@?kCNuFu... 51.529163 -0.109971 34.723175
1 ewiyHxyVsMmA{JmAmMuBUHq@lAGZtBxAFD~BpBNJGRWnB]... Phillimore Gardens, Kensington osjfaBnhmFu@xm@p@zOHvDKhUAzCE`EHzSMjJGtHI`Fm@r... 31.283333 25.950000 17.333333 isjfaBljmF}@~k@t@zOFvDGfUGzC?~DF|SOlJGtHE~Eu@t... 51.499607 -0.197574 18.670319
2 ewiyHxyVsMmA{JmAmMuBUHq@lAGZtBxAGR@BFFCJgBsA{B... Christopher Street, Liverpool Street osjfaBnhmFnCo~BJqMNc_@oAsPBsHFcJFsGZ_m@N}JeBU}... 22.183333 19.116667 19.300000 isjfaBljmFfCi`CNqMNc_@sAsPFuHDcJFsG\}l@?kCNuFu... 51.521284 -0.084606 27.974607
3 ewiyHxyVsMmA{JmAmMuBUHq@lAGZtBxAGR@BFFCJgBsA{B... St. Chad's Street, King's Cross osjfaBnhmFu@xm@p@zOHvDKhUAzCE`EHzSMjJuDD}FGiFi... 22.550000 18.916667 16.233333 isjfaBljmFfCi`CNqMNc_@sAsPFuHDcJFsG\}l@?kCNuFu... 51.530059 -0.120974 21.092430
4 ewiyHxyVt_@rDjJlAfBp@vFbE^\tAjCDHBNCHE@CH@D|Dv... Sedding Street, Sloane Square osjfaBnhmFu@xm@p@zOHvDKhUAzCE`EHzSMjJGtHI`Fm@r... 19.800000 13.533333 10.950000 isjfaBljmF}@~k@t@zOFvDGfUGzC?~DF|SOlJGtHE~Eu@t... 51.493130 -0.156876 14.857051

Cluster journeys using k-means++

(This isn't terribly enlightening, it turns out)

Visualising Differences in Route Geography

As a next step, we're going to visualise the different routes returned by Valhalla and OSRM. In order to do this, we need to decode the polyline-encoded route strings, collect these, and map them.

We'll need the Basemap matplotlib extension in order to create these static maps.


In [131]:
for provider in ['osrm', 'valhalla', 'gmaps']:
    if provider == 'gmaps':
        routes['%s_decoded' % provider] = routes.apply(
            lambda f: decode_polyline(f['%s_route' % provider], gmaps=True), axis=1)
    else:
        routes['%s_decoded' % provider] = routes.apply(
            lambda f: decode_polyline(f['%s_route' % provider]), axis=1)

Now set up a map, and project our routes into map coordinates


In [132]:
bds = MultiLineString(list(routes['valhalla_decoded'])).bounds
extra = 0.01
ll = (bds[0], bds[1])
ur = (bds[2], bds[3])
coords = list(chain(ll, ur))
w, h = coords[2] - coords[0], coords[3] - coords[1]

In [133]:
m = Basemap(
    projection='tmerc',
    lon_0 = -0.12203999999842599,
    lat_0 = 51.500829999995766,
    ellps = 'WGS84',
    llcrnrlon=coords[0] - extra * w,
    llcrnrlat=coords[1] - extra + 0.01 * h,
    urcrnrlon=coords[2] + extra * w,
    urcrnrlat=coords[3] + extra + 0.01 * h,
    resolution='i',
    suppress_ticks=True)

We can now create the projected versions of our routes, and combine each provider's routes into a MultiLineString


In [134]:
# first, calculate projected coordinates and centroid distance using our Basemap instance
centroid = m(-0.12204, 51.50083)

routes['projected_lon'], routes['projected_lat'] = m(
    *(routes["lon"].values, routes["lat"].values))

routes['centroid_distance'] = routes.apply(
    lambda x: math.sqrt(((abs(centroid[0] - x['projected_lon']) ** 2) +
                         (abs(centroid[1] - x['projected_lat']) ** 2))), axis=1)
combined_routes = {}
for provider in ['osrm', 'valhalla', 'gmaps']:
    # create projected version of routes
    routes['%s_decoded_proj' % provider] = routes.apply(
        lambda f: project_linestring(f['%s_decoded' % provider], m), axis=1)
    # combine projected route LineStrings
    combined_routes[provider] = MultiLineString(list(routes['%s_decoded_proj' % provider]))
    # we can also calculate the projected route distances (as opposed to euclidean distance, as above)
    routes['%s_route_length_proj' % provider] = routes.apply(
        lambda f: f['%s_decoded_proj' % provider].length / 1000, axis=1)

Let's carry out some regressions in order to determine time/great-circle route distance correlation


In [135]:
# OSRM regression
regressions = []
for provider in ['osrm', 'valhalla', 'gmaps']:
    regressions.append(
        sm.ols(formula='%s_route_length_proj ~ travel_time_%s' % (provider, provider),
        data=routes).fit())

for res in regressions:
    print res.summary2()


                     Results: Ordinary least squares
========================================================================
Model:              OLS                    Adj. R-squared:     0.997    
Dependent Variable: osrm_route_length_proj AIC:                -944.7353
Date:               2015-08-26 00:12       BIC:                -935.6048
No. Observations:   710                    Log-Likelihood:     474.37   
Df Model:           1                      F-statistic:        2.453e+05
Df Residuals:       708                    Prob (F-statistic): 0.00     
R-squared:          0.997                  Scale:              0.015432 
-------------------------------------------------------------------------
                     Coef.   Std.Err.     t      P>|t|    [0.025   0.975]
-------------------------------------------------------------------------
Intercept           -0.1292    0.0112  -11.5699  0.0000  -0.1511  -0.1073
travel_time_osrm     0.2304    0.0005  495.3078  0.0000   0.2295   0.2313
------------------------------------------------------------------------
Omnibus:                272.224        Durbin-Watson:           1.850   
Prob(Omnibus):          0.000          Jarque-Bera (JB):        2366.066
Skew:                   -1.475         Prob(JB):                0.000   
Kurtosis:               11.443         Condition No.:           58      
========================================================================

                       Results: Ordinary least squares
============================================================================
Model:              OLS                        Adj. R-squared:     0.937    
Dependent Variable: valhalla_route_length_proj AIC:                1279.7526
Date:               2015-08-26 00:12           BIC:                1288.8832
No. Observations:   710                        Log-Likelihood:     -637.88  
Df Model:           1                          F-statistic:        1.060e+04
Df Residuals:       708                        Prob (F-statistic): 0.00     
R-squared:          0.937                      Scale:              0.35408  
-----------------------------------------------------------------------------
                         Coef.   Std.Err.     t      P>|t|    [0.025   0.975]
-----------------------------------------------------------------------------
Intercept               -0.1766    0.0562   -3.1398  0.0018  -0.2870  -0.0662
travel_time_valhalla     0.3047    0.0030  102.9431  0.0000   0.2989   0.3105
----------------------------------------------------------------------------
Omnibus:                  71.678          Durbin-Watson:             1.459  
Prob(Omnibus):            0.000           Jarque-Bera (JB):          145.979
Skew:                     -0.606          Prob(JB):                  0.000  
Kurtosis:                 4.861           Condition No.:             48     
============================================================================

                     Results: Ordinary least squares
=========================================================================
Model:              OLS                     Adj. R-squared:     0.925    
Dependent Variable: gmaps_route_length_proj AIC:                1332.6621
Date:               2015-08-26 00:12        BIC:                1341.7926
No. Observations:   710                     Log-Likelihood:     -664.33  
Df Model:           1                       F-statistic:        8803.    
Df Residuals:       708                     Prob (F-statistic): 0.00     
R-squared:          0.926                   Scale:              0.38148  
--------------------------------------------------------------------------
                      Coef.   Std.Err.     t      P>|t|    [0.025   0.975]
--------------------------------------------------------------------------
Intercept            -0.9727    0.0745  -13.0514  0.0000  -1.1190  -0.8264
travel_time_gmaps     0.2672    0.0028   93.8236  0.0000   0.2616   0.2728
-------------------------------------------------------------------------
Omnibus:                39.699          Durbin-Watson:             1.229 
Prob(Omnibus):          0.000           Jarque-Bera (JB):          40.051
Skew:                   0.540           Prob(JB):                  0.000 
Kurtosis:               2.566           Condition No.:             84    
=========================================================================

The OSRM journeys look almost perfectly correlated, whereas the Valhalla journeys don't.

WHAT IS GOING ON WITH GOOGLE MAPS THOUGH


In [136]:
dotsize = 2.
linewidth = .5
fontsize = 12

plt.clf()
fig = plt.figure(figsize=(7.28, 5.25))

# define a grid of subplots
axes = [
    plt.subplot2grid((2, 3), (0, 0)),
    plt.subplot2grid((2, 3), (0, 1)),
    plt.subplot2grid((2, 3), (0, 2)),
    plt.subplot2grid((2, 3), (1, 0)),
    plt.subplot2grid((2, 3), (1, 1)),
    plt.subplot2grid((2, 3), (1, 2))
    ]

colours = ["#008080", "#E87600", "#9B30FF"]
# TODO sharex / sharey
for idx, provider in enumerate(['osrm', 'valhalla', 'gmaps']):
    ax = axes[idx]
    hist = routes['travel_time_%s' % provider].plot(
        ax=ax,
        kind='hist',
        color=colours[idx],
        edgecolor='#333333',
        bins=21,
        alpha=0.85,
        legend=True)

    ax.set_xlabel('Journey Duration (minutes)', fontsize=fontsize)
    ax.set_ylabel('Number of Journeys', fontsize=fontsize)
    ax.set_xlim(0, 50)
    ax.set_ylim(0, 75)
    leg = ax.legend(["%s" % provider], fontsize=9)
    leg.get_frame().set_alpha(0.5)
    ax.grid(b=False)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.set_axis_bgcolor('none')
    
    if idx != 0:
        ax.set_yticklabels([])
        ax.set_ylabel('')

for idx, provider in enumerate(['osrm', 'valhalla', 'gmaps']):
    ax = axes[idx + 3]
    osrm_cluster = ax.scatter(
        x=routes['travel_time_%s' % provider],
        y=routes['%s_route_length_proj' % provider],
        marker='o',
        color=colours[idx],
        edgecolor='#333333',
        lw=linewidth,
        s=dotsize,
        alpha=0.55,)

    ax.set_xlabel('Journey Duration (minutes)', fontsize=fontsize)
    ax.set_ylabel('Great-circle Distance (km)', fontsize=fontsize)
    ax.grid(b=False)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.set_axis_bgcolor('none')
    leg = ax.legend(
        [r'$R^2$: %0.3f)' % (regressions[idx].rsquared)],
        scatterpoints=1,
        loc='lower right',
        fontsize=9)
    leg.get_frame().set_alpha(0.5)
    ax.set_xlim(0, 50)
    ax.set_ylim(0, 12)
    if idx + 3 != 3:
        ax.set_yticklabels([])
        ax.set_ylabel('')

axes[0].set_title('Distribution of %s routes' % len(routes), fontsize=10)
axes[3].set_title('Duration vs Distance', fontsize=10)
        
        
# Save etc
plt.tight_layout()
fig.set_size_inches(12, 9)
plt.savefig(
    'combined.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=True,
    dpi=300)

plt.show()


<matplotlib.figure.Figure at 0x10fbec110>

As is obvious from the histograms, the journey times and their distribution vary significantly between the three routers; Valhalla thinks you can get to any station in just over 35 minutes, with the majority taking 15–20 minutes. OSRM's journey times are far more evenly distributed, and greater in duration. Most (though not all) of the variation in duration is explained by Valhalla's default cycle speed of 25kph, versus OSRM's 15kph. When cycling speed and other costing options become available in the API, it'll be much easier to make more nuanced comparisons.

An R-squared of .997 implies almost-perfect correlation, in the case of OSRM.
R-squared values of .937 (Valhalla) and .926 (Google Maps) imply high correlation, but there are clearly some factors we aren't taking into account.

Some possible conclusions:

  • The costs assigned by OSRM do not seem to vary with journey length, or any other measure (elevation, number of turns, traffic lights etc.)
  • By comparison, Valhalla and Google Maps do seem to be assigning costs in a different way according to certain route characteristics
    • Based on variation of returned routes, Google Maps is using live data (perhaps traffic) as a factor when determining an optimal route

We can also do some clustering, though this isn't terribly informative, as it turns out


In [19]:
centroids = {}
for provider in ['osrm', 'valhalla', 'gmaps']:
    km = KMeans(init='k-means++', n_clusters=7, n_init=10)
    km.fit(routes[['centroid_distance', 'travel_time_%s' % provider]])
    routes['kmeans_bin_%s' % provider] = km.labels_
    centroids[provider] = km.cluster_centers_

We can also calculate the percentage of route segments each route has in common


In [20]:
routes['osrm_valhalla_similarity'] = routes.apply(
    lambda f: similarity(f['osrm_decoded'], f['valhalla_decoded']), axis=1)

routes['osrm_gmaps_similarity'] = routes.apply(
    lambda f: similarity(f['osrm_decoded'], f['gmaps_decoded']), axis=1)

routes['valhalla_gmaps_similarity'] = routes.apply(
    lambda f: similarity(f['valhalla_decoded'], f['gmaps_decoded']), axis=1)

In [21]:
plt.clf()
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(111)

routes['osrm_valhalla_similarity'].plot(
        ax=ax,
        kind='hist',
        edgecolor='#333333',
        bins=21,
        alpha=0.85,
        legend=True)

ax.set_xlabel('Route segments in common (\%)', fontsize=fontsize)
ax.set_ylabel('Number of routes', fontsize=fontsize)
leg = ax.legend(["OSRM \& Valhalla"], fontsize=9)
leg.get_frame().set_alpha(0.5)
ax.grid(b=False)
ax.set_axis_bgcolor('none')

plt.show()


<matplotlib.figure.Figure at 0x110517610>

Outlier detection on the Valhalla data using RANSAC


In [22]:
time_reshaped = np.reshape(
    routes['travel_time_valhalla'], (len(routes['travel_time_valhalla']), 1))
route_reshaped = np.reshape(
    routes['valhalla_route_length_proj'], (len(routes['valhalla_route_length_proj']),))

In [23]:
model_ransac = linear_model.RANSACRegressor(linear_model.LinearRegression(), max_trials=1000)
res = model_ransac.fit(time_reshaped, route_reshaped)
inlier_mask = model_ransac.inlier_mask_
outlier_mask = np.logical_not(inlier_mask)

np.sum(outlier_mask)


Out[23]:
4

In [24]:
plt.clf()
fig = plt.figure(figsize=(16, 12))
ax = fig.add_subplot(111)

plt.scatter(
    time_reshaped[inlier_mask], route_reshaped[inlier_mask],
    color="#008080",
    label='Inliers',
    marker='o',
    edgecolor='#333333',
    alpha=.85)

plt.scatter(
    time_reshaped[outlier_mask], route_reshaped[outlier_mask],
    color="#E87600",
    label='Outliers (%s)' % np.sum(outlier_mask),
    marker='o',
    edgecolor='#333333',
    alpha=.85)

ax.set_xlabel('Journey Duration (minutes)', fontsize=fontsize)
ax.set_ylabel('Great-circle Distance (km)', fontsize=fontsize)
plt.title("Valhalla Routes, Outliers Identified (RANSAC)")

plt.legend()
plt.tight_layout()
fig.set_size_inches(12, 9)
plt.savefig(
    'RANSAC.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=True,
    dpi=72)
plt.show()


<matplotlib.figure.Figure at 0x113057a10>

Using the resulting 'mask' arrays, we can see the outlier destinations, and how long Google thinks it should take


In [25]:
routes.loc[outlier_mask][['name', 'valhalla_route_length_proj', 'gmaps_route_length_proj']]


Out[25]:
name valhalla_route_length_proj gmaps_route_length_proj
659 Pembridge Villas, Notting Hill 6.802932 7.336343
660 South Kensington Station, South Kensington 7.071861 5.199306
671 Harrowby Street, Marylebone 5.020362 5.376170
703 Gwendwr Road, West Kensington 8.663382 8.442908

And map them…


In [26]:
plt.clf()
fig = plt.figure(figsize=(16, 12), dpi=300)
ax = fig.add_subplot(111, axisbg='w', frame_on=False)

m.fillcontinents('#404040')
m.drawparallels(
    np.arange(ll[0], ur[0], 2.),
    color = 'black', linewidth = 0.5,
    labels=[True, False, False, False], ax=ax)
m.drawmeridians(
    np.arange(ll[1], ur[1], 2.),
    color = 'black', linewidth = 0.5,
    labels=[False, False, False, True], ax=ax)

pc_valhalla = PatchCollection(
    [PolygonPatch(
        line.buffer(20)) for line in MultiLineString(
            list(routes.loc[outlier_mask]['valhalla_decoded_proj']))],
    match_original=False, alpha=.25, color="#E87600", lw=.001, zorder=4)
pc_valhalla.set_label("valhalla")
ax.add_collection(pc_valhalla)

m.scatter(
    routes.loc[outlier_mask]['projected_lon'],
    routes.loc[outlier_mask]['projected_lat'],
    s=50,
    color='#ffffff',
    edgecolor='#000000',
    ax=ax,
    zorder=4)

thames = m.readshapefile(
    'thames_wgs84',
    'thames',
    color='none',
    zorder=4)
thames_poly = unary_union([Polygon(xy) for xy in m.thames])
tp = PatchCollection(
    [PolygonPatch(poly) for poly in thames_poly],
    match_original=False,
    color='#97a7a7', lw=.25, alpha=1., zorder=3)
ax.add_collection(tp)
plt.title("Shorter-than-expected Routes, Valhalla (%s Routes)" % len(routes.loc[outlier_mask]))

plt.tight_layout()
fig.set_size_inches(12, 9)
plt.savefig(
    'outliers_mapped.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=True,
    dpi=72)

plt.show()


<matplotlib.figure.Figure at 0x112843650>

Visualising all routes on a map


In [27]:
plt.clf()
fig = plt.figure(figsize=(16, 12), dpi=300)
ax = fig.add_subplot(111, axisbg='w', frame_on=False)

m.fillcontinents('#404040')
m.drawparallels(
    np.arange(ll[0], ur[0], 2.),
    color = 'black', linewidth = 0.5,
    labels=[True, False, False, False], ax=ax)
m.drawmeridians(
    np.arange(ll[1], ur[1], 2.),
    color = 'black', linewidth = 0.5,
    labels=[False, False, False, True], ax=ax)

# set up our route patch collections
patchcollections = {}
for provider, colour in [("osrm", "#008080"), ("valhalla", "#E87600"), ("gmaps", "#9B30FF")]:
    pc = PatchCollection([PolygonPatch(
    line.buffer(10)) for line in combined_routes[provider]],
    match_original=False, alpha=.25, color=colour, lw=.001, zorder=4)
    pc.set_label(provider)
    patchcollections[provider] = pc
    # ax.add_collection(pc)

# cascade Thames polygons together, and plot
thames = m.readshapefile(
    'thames_wgs84',
    'thames',
    color='none',
    zorder=4)
thames_poly = unary_union([Polygon(xy) for xy in m.thames])
tp = PatchCollection(
    [PolygonPatch(poly) for poly in thames_poly],
    match_original=False,
    color='#97a7a7', lw=.25, alpha=1., zorder=3)
ax.add_collection(tp)

# add centroid
centroid = m(-0.12204, 51.50083)
m.scatter(*centroid,
    s=5., edgecolor='#000000', alpha=1.,
    color='w', zorder=5)

# fake a legend. Tsk
lt = m.scatter([], [], color="#008080")
lo = m.scatter([], [], color="#E87600")
lg = m.scatter([], [], color="#9B30FF")

labels = ["OSRM", "Valhalla", "Google Maps"]
leg = plt.legend([lt, lo, lg], labels)
leg.get_frame().set_alpha(0.5)

plt.title("Routes from Bike Share Network Centroid to %s Stations" % len(routes))
plt.tight_layout()
plt.savefig('routes.png', dpi=300, bbox_inches='tight', alpha=True, transparent=True)

# convenience functions for adding and removing PatchCollections 
def remove_patches(ax, label):
    """ find and remove a PatchCollection by name """
    for idx, pc in enumerate(ax.collections):
        if pc.get_label() == label:
            ax.collections.pop(idx)

def display_osrm(OSRM):
    """ Display or remove OSRM layer according to boolean input """
    if OSRM:
        ax.add_collection(patchcollections['osrm'])
    else:
        remove_patches(ax, "osrm")
    display(fig)            

def display_valhalla(Valhalla):
    """ Display or remove Valhalla layer according to boolean input """
    if Valhalla:
        ax.add_collection(patchcollections['valhalla'])
    else:
        remove_patches(ax, "valhalla")
    display(fig)
    
def display_gmaps(GMaps):
    """ Display or remove GMaps layer according to boolean input """
    if GMaps:
        ax.add_collection(patchcollections['gmaps'])
    else:
        remove_patches(ax, "gmaps")
    display(fig)

plt.close(fig)
interact(display_osrm, OSRM=True)
interact(display_valhalla, Valhalla=True)
interact(display_gmaps, GMaps=True)

plt.show()


None

In [ ]:
plt.savefig(
    'mpl_map.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=True,
    dpi=300)

Outputting our decoded Polyline journeys to GeoJSON

Dumping to GeoJSON is straightforward.

  • import geojson
  • encode each LineString as a Feature
  • wrap the list of Features in a FeatureCollection
  • dump the FeatureCollection as a JSON string

In [ ]:
for provider in ['osrm', 'valhalla', 'gmaps']:
    with open('html/%s_geojson.json' % provider, 'w') as f:
        f.write(gj_dumps(FeatureCollection(
            [Feature(geometry=feat) for feat in list(routes['%s_decoded' % provider])])))

In [ ]:
def route_valhalla(df, start):
    return query_route_valhalla(api_key, start, (df['lon'], df['lat']), 'bicycle')

def route_gmaps(df, start):
    return query_route_gmaps(start, (df['lon'], df['lat']), 'bicycling', gmaps_key)

def route_osrm(df, start):
    return query_route_osrm(start, (df['lon'], df['lat']), 'bicycle')

In [ ]:
# Get updated stations from TfL XML
# parse XML into dict
tree = etree.parse("livecyclehireupdates.xml")
root = tree.getroot()

output = dict()
output['name'] = []
output['lon'] = []
output['lat'] = []

for each in root.xpath('station'):
    output['name'].append(each[1].text)
    output['lon'].append(each[4].text)
    output['lat'].append(each[3].text)

stations = pd.DataFrame(output)
stations[['lon', 'lat']] = stations[['lon', 'lat']].astype(float)

In [ ]:
len(stations)

In [ ]:
# we only want to keep stations NOT IN our route dataframe
stations = stations[-stations['name'].isin(routes['name'])]
len(stations)

There is no route available to each of the following stations from one or more OpenStreetMap routers


In [ ]:
def table_gen(header_text, iterable):
    """ Incredibly hacky Markdown table generator """
    header = "|{f} ({g} stations)|\n:---------:\n".format(f=header_text, g=len(iterable))
    row = ("|{t}|".format(t=text) for text in iterable)
    return header + "\n".join(row)

In [ ]:
print table_gen("Station Name", stations['name'])

Begin to retrieve missing stations from our routers


In [ ]:
stations['travel_time_gmaps'], stations['gmaps_route'] = zip(*
    stations.apply(route_gmaps, args=((-0.12203999999842599, 51.500829999995766),), axis=1))

stations = stations.dropna()
len(stations)

In [ ]:
stations['travel_time_osrm'], stations['osrm_route'] = zip(*
    stations.apply(route_osrm, args=((-0.12203999999842599, 51.500829999995766),), axis=1))

stations = stations.dropna()
len(stations)

In [ ]:
stations['travel_time_valhalla'], stations['valhalla_route'] = zip(*
    stations.apply(route_valhalla, args=((-0.12203999999842599, 51.500829999995766),), axis=1))

stations = stations.dropna()
len(stations)

In [ ]:
stations['travel_time_valhalla'] = stations['travel_time_valhalla'] / 60
stations['travel_time_osrm'] = stations['travel_time_osrm'] / 60
stations['travel_time_gmaps'] = stations['travel_time_gmaps'] / 60

In [ ]:
centroid = m(-0.12204, 51.50083)

stations['projected_lon'], stations['projected_lat'] = m(*(stations["lon"].values, stations["lat"].values))
stations['centroid_distance'] = stations.apply(
    lambda x: math.sqrt(((abs(centroid[0] - x['projected_lon']) ** 2) +
                         (abs(centroid[1] - x['projected_lat']) ** 2))), axis=1)

In [ ]:
routes = pd.concat(
    [routes, stations],
    ignore_index=True
)

Plotting Station Elevation Data


In [221]:
plt.clf()
fig = plt.figure()
ax = fig.add_subplot(111, axisbg='w', frame_on=False, projection='3d')
s = ax.plot_trisurf(routes.lon, routes.lat, routes.station_elevation, cmap='coolwarm')
ax.set_zlim(0, 150)
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_zlabel('Elevation (m)')

plt.show()


<matplotlib.figure.Figure at 0x114d46cd0>

Plotting Route Elevation Data


In [300]:
journey_points = pd.read_csv("gmaps_elevations.csv", index_col=0)

In [368]:
plt.clf()
fig = plt.figure()
ax = fig.add_subplot(111, axisbg='w', frame_on=False, projection='3d')
for row in journey_points.iterrows():
    journey = eval(row[1][0])
    journey_elevations = [station['elevation'] for station in journey]
    journey_x = [station['location']['lng'] for station in journey]
    journey_y = [station['location']['lat'] for station in journey]    
    projected_lon, projected_lat = m(
        *(journey_x, journey_y))
    # "#E87600"
    # "#9B30FF"
    l = ax.plot(projected_lon, projected_lat, journey_elevations, color='#008080', alpha=.5, lw=.5)

ax.scatter(routes.projected_lon, routes.projected_lat, routes.station_elevation, color="#E87600", s=3)
ax.set_zlim(0, 150)
ax.set_xlim(-1000, 16000)
ax.set_xlabel('Projected Longitude')
ax.set_ylabel('Projected Latitude')
ax.set_zlabel('Elevation (m)')

# remove axes etc
ax.grid(False)
for a in (ax.w_xaxis, ax.w_yaxis, ax.w_zaxis):
    for t in a.get_ticklines() + a.get_ticklabels():
        t.set_visible(False)
    a.line.set_visible(False)
    a.pane.set_visible(False)

plt.title("Routes from Westminster Bridge to all Bike Stations")
plt.tight_layout()
fig.set_size_inches(12, 9)
plt.savefig(
    'gmaps_elevations.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=True,
    dpi=300)
plt.show


Out[368]:
<function matplotlib.pyplot.show>
<matplotlib.figure.Figure at 0x110738450>

This doesn't work yet


In [260]:
# y is elevation
# numpy.interp(midpoints, x, y)
journey = journey_points.iloc[0]
decoded = eval(journey[0])
route = decode_polyline(routes.iloc[1]['gmaps_route'], gmaps=True)
segments = segmentise(route)
coords_x = [line.coords[0][0] for line in segments]
coords_elevation = [segment['elevation'] for segment in decoded]
midpoints_x = [line.interpolate(line.length / 2).coords[0][0] for line in segments]

Combine all route elevation data and create a grid


In [325]:
route_points_x = []
route_points_y = []
elevations = []
for route in journey_points.iterrows():
    journey = eval(route[1][0])
    route_points_x.extend([station['location']['lng'] for station in journey])
    route_points_y.extend([station['location']['lat'] for station in journey])
    elevations.extend([station['elevation'] for station in journey])

In [326]:
all_points = pd.DataFrame({'lon': route_points_x, 'lat': route_points_y, 'elevation': elevations})
# project
all_points['projected_lon'], all_points['projected_lat'] = m(
    *(all_points["lon"].values, all_points["lat"].values))

In [356]:
from matplotlib.colors import ColorConverter
conv = ColorConverter()

In [357]:
# get valid RGBA values for dark grey
conv.to_rgba("#404040")
# get centroid
centroid = m(-0.12204, 51.50083)


Out[357]:
(0.25098039215686274, 0.25098039215686274, 0.25098039215686274, 1.0)

In [393]:
from scipy.interpolate import griddata as gd

# set up a square grid with the same extents as our measured data
numcols, numrows = 1000, 1000
xi = np.linspace(all_points['projected_lon'].min(), all_points['projected_lon'].max(), numcols)
yi = np.linspace(all_points['projected_lat'].min(), all_points['projected_lat'].max(), numrows)
# get lon and lat coords of our grid points
xi, yi = np.meshgrid(xi, yi)
# interpolate
zi = gd(
    (all_points[['projected_lon', 'projected_lat']]),
    all_points['elevation'].values,
    (xi, yi),
    method='linear')

In [374]:
plt.clf()
fig = plt.figure()
ax = fig.add_subplot(111, axisbg='#404040', frame_on=False, projection='3d')
ax.plot_wireframe(xi, yi, zi, color='#008080', lw=.05, alpha=.75)

for row in journey_points.iterrows():
    journey = eval(row[1][0])
    journey_elevations = [station['elevation'] for station in journey]
    journey_x = [station['location']['lng'] for station in journey]
    journey_y = [station['location']['lat'] for station in journey]    
    projected_lon, projected_lat = m(
        *(journey_x, journey_y))
    l = ax.plot(projected_lon, projected_lat, journey_elevations, color='#E87600', alpha=.5, lw=.1)

ax.scatter(
    routes.projected_lon, routes.projected_lat, routes.station_elevation,
    color="#E87600", s=1.5, alpha=.5)
# ax.scatter(centroid[0], centroid[1], [1], color='w', s=5., edgecolor='#000000', alpha=.75)
ax.set_xlim(-1000, 16000)
ax.set_zlim(0, 150)

# ax.w_xaxis.set_pane_color((0.25098039215686274, 0.25098039215686274, 0.25098039215686274, 1.0))
# ax.w_yaxis.set_pane_color((0.25098039215686274, 0.25098039215686274, 0.25098039215686274, 1.0))
# ax.w_zaxis.set_pane_color((0.25098039215686274, 0.25098039215686274, 0.25098039215686274, 1.0))
# remove axes etc
ax.grid(False)
for a in (ax.w_xaxis, ax.w_yaxis, ax.w_zaxis):
    for t in a.get_ticklines() + a.get_ticklabels():
        t.set_visible(False)
    a.line.set_visible(False)
    a.pane.set_visible(False)

ax.set_xlabel('Projected Longitude', color='w')
ax.set_ylabel('Projected Latitude', color='w')
ax.set_zlabel('Elevation (m)', color='w')
plt.title("Interpolated Surface (1000 x 1000 points)", color='w')
fig.set_size_inches(12, 9)
plt.tight_layout()
plt.savefig(
    'gmaps_surface_elevations.png',
    format="png",
    bbox_inches='tight',
    alpha=True,
    transparent=False,
    dpi=300)

plt.show()


<matplotlib.figure.Figure at 0x1048d4390>

In [394]:
plt.clf()
fig = plt.figure(figsize=(16, 12), dpi=300)
ax = fig.add_subplot(111, axisbg='w', frame_on=False)

m.fillcontinents('#404040')
m.drawparallels(
    np.arange(ll[0], ur[0], 2.),
    color = 'black', linewidth = 0.5,
    labels=[True, False, False, False], ax=ax)
m.drawmeridians(
    np.arange(ll[1], ur[1], 2.),
    color = 'black', linewidth = 0.5,
    labels=[False, False, False, True], ax=ax)

# cascade Thames polygons together, and plot
thames = m.readshapefile(
    'thames_wgs84',
    'thames',
    color='none',
    zorder=4)
thames_poly = unary_union([Polygon(xy) for xy in m.thames])
tp = PatchCollection(
    [PolygonPatch(poly) for poly in thames_poly],
    match_original=False,
    color='#97a7a7', lw=.25, alpha=1., zorder=3)
ax.add_collection(tp)

conf = m.contourf(xi, yi, zi, 15, cmap='RdPu', color='#008080', lw=.05, alpha=.65, zorder=4)
cbar = plt.colorbar(conf, orientation='horizontal', fraction=.057, pad=0.05)
cbar.set_label("Elevation (m)")

# add centroid
centroid = m(-0.12204, 51.50083)
m.scatter(*centroid,
    s=5., edgecolor='#000000', alpha=1.,
    color='w', zorder=4)


plt.title("Elevation From Interpolated Values")
plt.tight_layout()
# plt.savefig('routes.png', dpi=300, bbox_inches='tight', alpha=True, transparent=True)


plt.show()


<matplotlib.figure.Figure at 0x115338cd0>

In [ ]: