Since we announced our collaboration with the World Bank and more partners to create the Open Traffic platform, we’ve been busy. We’ve shared two technical previews of the OSMLR linear referencing system. Now we’re ready to share more about how we’re using Mapzen Map Matching to “snap” GPS-derived locations to OSMLR segments, and how we’re using a data-driven approach to evaluate and improve the algorithms.
Mapzen has been testing and matching GPS measurements from some of Open Traffic’s partners since development began, but one burning question remained: were our matches any good? Map-matching real-time GPS traces is one thing, but without on-the-ground knowledge of where the traces actually came from, it was impossible to determine how close to, or far from, the truth our predictions were.
Our in-house solution was to use Mapzen's very own Turn-By-Turn routing API to simulate fake GPS data, send the synthetic data through the Mapzen Map Matching service, and compare the results to the original routes used to simulate the fake traces. We have documented this process below:
In [1]:
from __future__ import division
from matplotlib import pyplot as plt
from matplotlib import cm, colors, patheffects
import numpy as np
import os
import glob
import urllib
import json
import pandas as pd
from random import shuffle, choice
import pickle
import sys; sys.path.insert(0, os.path.abspath('..'))  # make the local validator package importable
import validator.validator as val
%matplotlib inline
In [2]:
mapzenKey = os.environ.get('MAPZEN_API')
gmapsKey = os.environ.get('GOOGLE_MAPS')
The first step in route generation is picking a test region, which for us was San Francisco. Routes are defined as a set of start and stop coordinates, which we obtain by randomly sampling venues from Mapzen’s Who’s on First gazetteer for the specified city. Additionally, we want to limit our route distances to be between ½ km and 1 km because this is the localized scale at which map matching actually takes place.
In this example, we specify 200 fake routes:
In [3]:
cityName = 'San Francisco'
minRouteLen = 0.5 # specified in km
maxRouteLen = 1 # specified in km
numRoutes = 200
In [29]:
# Using Mapzen Venues (requires good Who's on First coverage)
routeList = val.get_routes_by_length(cityName, minRouteLen, maxRouteLen, numRoutes, apiKey=mapzenKey)
## Using Google Maps POIs (better for non-Western capitals):
# routeList = val.get_POI_routes_by_length(cityName, minRouteLen, maxRouteLen, numRoutes, gmapsKey)
A sample route:
In [19]:
myRoute = routeList[2]
myRoute
Out[19]:
For each route, we then pass the start and end coordinates to the Turn-By-Turn API to obtain the coordinates of the road segments along the route:
In [17]:
shape, routeUrl = val.get_route_shape(myRoute)
The Turn-By-Turn API returns the shape of the route as an encoded polyline:
In [18]:
shape
Out[18]:
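If you are curious what is packed inside that encoded string, a polyline is simply a delta-encoded list of coordinates. Below is a minimal decoder sketch, assuming the six-decimal-digit precision used by Mapzen's routing engine (use 1e-5 for Google-style polylines); the helper is purely illustrative and not part of the validator module:

def decode_polyline(encoded, precision=1e-6):
    """Decode an encoded polyline string into a list of (lat, lon) pairs."""
    coords = []
    index = lat = lon = 0
    while index < len(encoded):
        # each point is stored as a delta-encoded latitude, then longitude
        for is_lon in (False, True):
            shift = result = 0
            while True:
                b = ord(encoded[index]) - 63
                index += 1
                result |= (b & 0x1F) << shift
                shift += 5
                if b < 0x20:
                    break
            delta = ~(result >> 1) if result & 1 else result >> 1
            if is_lon:
                lon += delta
            else:
                lat += delta
        coords.append((lat * precision, lon * precision))
    return coords

routeCoords = decode_polyline(shape)  # list of (lat, lon) pairs along the route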
The route shape then gets passed to the map matching service in order to obtain the coordinates and attributes of the road segments (i.e. edges) that lie along the original route:
In [24]:
edges, matchedPts, shapeCoords, _ = val.get_trace_attrs(shape)
edges = val.get_coords_per_second(shapeCoords, edges, '2768')
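For context, get_trace_attrs() wraps the map-matching service's trace_attributes request. The snippet below is only a rough illustration of what such a request might look like; the endpoint URL and parameter values are assumptions based on the Valhalla-style API, not code lifted from the validator module:

# Hypothetical request payload; parameter names follow the Valhalla-style
# trace_attributes API, and the hosted URL is assumed rather than confirmed.
matchRequest = {
    "encoded_polyline": shape,   # route shape returned by get_route_shape()
    "costing": "auto",           # match against the drivable road network
    "shape_match": "map_snap"    # snap the shape onto the most likely edges
}
matchUrl = 'https://valhalla.mapzen.com/trace_attributes?api_key={0}&json={1}'.format(
    mapzenKey, json.dumps(matchRequest))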
We can inspect the attributes returned for our example route:
In [28]:
val.format_edge_df(edges).head()
Out[28]:
Now that we have a set of "ground-truthed" routes, meaning route coordinates and their associated road segments, we want to simulate what the GPS data recorded along these routes might look like. This involves two main steps:
1. Resample the coordinates: we use the known speed along each road segment to retain a point along the route for every $n$ seconds of simulated travel.
2. Simulate GPS noise: we randomly sample from a normal distribution with standard deviation equal to a specified noise level (in meters), apply this vector of noise to each coordinate pair, and use a rolling average to smooth out the change in noise between subsequent “measurements”, recreating the phenomenon of GPS “drift”. A rough sketch of this step follows below.
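Here is a minimal sketch of the noise step under those assumptions. The function name, the rolling window size, and the flat-earth conversion from meters to degrees are illustrative choices for this notebook, not the validator's actual implementation:

import numpy as np
import pandas as pd

def simulate_gps_noise(coords, noise_std_m=20.0, window=5):
    """Illustrative only: perturb an array of (lat, lon) pairs with smoothed
    Gaussian noise so that consecutive points drift together like real GPS."""
    coords = np.asarray(coords, dtype=float)
    # Draw independent offsets (in meters) for every resampled point.
    raw_noise = np.random.normal(0, noise_std_m, size=coords.shape)
    # A rolling average correlates successive errors, mimicking GPS "drift".
    smoothed = pd.DataFrame(raw_noise).rolling(window, min_periods=1).mean().values
    # Convert metric offsets to degrees (flat-earth approximation, fine at city scale).
    offsets_deg = smoothed / 111111.0
    offsets_deg[:, 1] /= np.cos(np.radians(coords[:, 0]))  # lon degrees shrink with latitude
    return coords + offsets_deg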
Since we are interested in assessing the performance of map-matching under a variety of different conditions, we define a range of realistic sample rates and noise levels:
In [21]:
sampleRates = [1, 5, 10, 20, 30] # specified in seconds
noiseLevels = np.linspace(0, 100, 21) # specified in meters
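These settings define a 5 × 21 grid of test conditions per route; a quick, purely illustrative enumeration confirms the scale of the experiment:

from itertools import product

conditions = list(product(sampleRates, noiseLevels))
print(len(conditions))              # 105 sample-rate/noise combinations per route
print(len(conditions) * numRoutes)  # 21000 simulated traces in total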
In order to validate our matched routes, we need an effective method of scoring. Should a Type I error (a false positive) carry as much weight as a Type II error (a false negative)? Should a longer mismatched segment carry a greater penalty than a shorter one?
We adopt the scoring mechanism proposed by Newson and Krumm (2009), upon whose map-matching algorithm the Meili service is based:
In the above schematic, $\text{d}_+$ refers to a false positive or Type I error, while $\text{d}_-$ represents a false negative, or Type II error. The final reported error is a combination of both types of mismatches, weighted by their respective lengths.
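Using that notation and letting $d_0$ denote the length of the ground-truth route, the combined score can be written, following Newson and Krumm (2009), as the total mismatched length relative to the true route length:

$$\text{error} = \frac{d_- + d_+}{d_0}$$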
Behind the scenes, the get_route_metrics() function takes care of the trace resampling, noise simulation, map matching, and scoring for every combination of route, sample rate, and noise level:
In [22]:
matchDf, _, _ = val.get_route_metrics(routeList, sampleRates, noiseLevels, saveResults=False)
Iterating through each of our 200 routes at 5 sample rates and 21 levels of noise, we simulate 21,000 distinct GPS traces. We then pass each of these simulated traces through the map-matching service and compare the resulting routes against the ground-truthed, pre-perturbation route segments.
The previous step will typically take a long time to run if you are generating a lot of routes (> 10), so it's a good idea to save your dataframes for later use.
In [11]:
matchDf.to_csv('{0}_{1}_matches.csv'.format(cityName, str(numRoutes)), index=False)
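Reloading the saved results in a later session then avoids re-running the whole simulation:

matchDf = pd.read_csv('{0}_{1}_matches.csv'.format(cityName, str(numRoutes)))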
Next, we ensure that the Reporter (the map-matching service) is not failing frequently for any particular sample rate or noise level:
In [14]:
val.plot_pattern_failure(matchDf, sampleRates, noiseLevels)
The graphs below represent the median scores for 6 error metrics applied to each of our 21,000 simulated traces, broken down by sample rate and noise level. Plots in the left column are based solely on error rate, i.e. the percentage of Type I, Type II, or combined Type I/II mismatches. The right-hand column shows the same metrics as the left, but weighted by segment length. The top-right plot thus represents the metric used by Newson and Krumm, and the two plots below it represent the same value broken out by error type.
In [15]:
val.plot_distance_metrics(matchDf, sampleRates)
There is a lot of valuable information to be gleaned from our validation process. The main takeaway: our validation allows us to state with a greater measure of confidence that our map-matching is working as intended, at least in San Francisco.
In the next installment of this series, we will see how we can actually use these error metrics to optimize the performance of our map-matching algorithm based on the quality of the data made available to us.