Pandana network accessibility demo

This notebook uses pandana (v0.2) to download street network and points-of-interest data from OpenStreetMap and then calculate network accessibility to the points of interest. Note: pandana currently only runs on Python 2.

For a simpler demo, check out pandana-accessibility-demo-simple.ipynb


In [1]:
import pandana, time, os, pandas as pd, numpy as np
from pandana.loaders import osm
%matplotlib inline

1. Set up configuration of script


In [2]:
# configure search at a max distance of 1 km for up to the 10 nearest points-of-interest
amenities = ['restaurant', 'bar', 'school']
distance = 1000
num_pois = 10
num_categories = len(amenities) + 1 #one for each amenity, plus one extra for all of them combined

# bounding box as a list of llcrnrlat, llcrnrlng, urcrnrlat, urcrnrlng
bbox = [37.76, -122.35, 37.9, -122.17] #lat-long bounding box for berkeley/oakland

In [3]:
# configure filenames to save/load POI and network datasets
bbox_string = '_'.join([str(x) for x in bbox])
net_filename = 'data/network_{}.h5'.format(bbox_string)
poi_filename = 'data/pois_{}_{}.csv'.format('_'.join(amenities), bbox_string)

In [4]:
# keyword arguments to pass for the matplotlib figure
bbox_aspect_ratio = (bbox[2] - bbox[0]) / (bbox[3] - bbox[1])
fig_kwargs = {'facecolor':'w', 
              'figsize':(10, 10 * bbox_aspect_ratio)}

# keyword arguments to pass for scatter plots
plot_kwargs = {'s':5, 
               'alpha':0.9, 
               'cmap':'viridis_r', 
               'edgecolor':'none'}

# network aggregation plots are the same as regular scatter plots, but without a reversed colormap
agg_plot_kwargs = plot_kwargs.copy()
agg_plot_kwargs['cmap'] = 'viridis'

# keyword arguments to pass for hex bin plots
hex_plot_kwargs = {'gridsize':60,
                   'alpha':0.9, 
                   'cmap':'viridis_r', 
                   'edgecolor':'none'}

# keyword arguments to pass to make the colorbar
cbar_kwargs = {}

# keyword arguments to pass to basemap
bmap_kwargs = {}

# color to make the background of the axis
bgcolor = 'k'

2. Download points of interest (POIs) and network data from OSM

First get the points of interest - either load an existing set for the specified amenities and bounding box from CSV, or get it from the OSM API.


In [5]:
start_time = time.time()
if os.path.isfile(poi_filename):
    # if a points-of-interest file already exists, just load the dataset from that
    pois = pd.read_csv(poi_filename)
    method = 'loaded from CSV'
else:   
    # otherwise, query the OSM API for the specified amenities within the bounding box 
    osm_tags = '"amenity"~"{}"'.format('|'.join(amenities))
    pois = osm.node_query(bbox[0], bbox[1], bbox[2], bbox[3], tags=osm_tags)
    
    # using the '"amenity"~"school"' returns preschools etc, so drop any that aren't just 'school' then save to CSV
    pois = pois[pois['amenity'].isin(amenities)]
    pois.to_csv(poi_filename, index=False, encoding='utf-8')
    method = 'downloaded from OSM'
    
print('{:,} POIs {} in {:,.2f} seconds'.format(len(pois), method, time.time()-start_time))
pois[['amenity', 'name', 'lat', 'lon']].head()


944 POIs loaded from CSV in 0.01 seconds
Out[5]:
amenity name lat lon
0 restaurant Berkeley Thai House 37.867150 -122.258297
1 restaurant Herbivore The Earthly Grill 37.864796 -122.267140
2 restaurant Pat Brown's Grille 37.873327 -122.264884
3 restaurant Tako Sushi 37.867231 -122.258692
4 restaurant Julie's Healthy Cafe 37.868647 -122.257546

In [6]:
# how many points of interest of each type of amenity did we retrieve?
pois['amenity'].value_counts()


Out[6]:
restaurant    761
school        103
bar            80
Name: amenity, dtype: int64

Next get the street network data - either load an existing dataset for the specified bounding box from HDF5, or get it fresh from the OSM API.


In [7]:
start_time = time.time()
if os.path.isfile(net_filename):
    # if a street network file already exists, just load the dataset from that
    network = pandana.network.Network.from_hdf5(net_filename)
    method = 'loaded from HDF5'
else:
    # otherwise, query the OSM API for the street network within the specified bounding box
    network = osm.network_from_bbox(bbox[0], bbox[1], bbox[2], bbox[3])
    method = 'downloaded from OSM'
    
    # identify nodes that are connected to fewer than some threshold of other nodes within a given distance
    lcn = network.low_connectivity_nodes(impedance=1000, count=10, imp_name='distance')
    network.save_hdf5(net_filename, rm_nodes=lcn) #remove low-connectivity nodes and save to h5
    
print('Network with {:,} nodes {} in {:,.2f} secs'.format(len(network.node_ids), method, time.time()-start_time))


Network with 22,554 nodes loaded from HDF5 in 2.02 secs

When saving a Network to HDF5 it’s possible to exclude certain nodes. This can be useful when refining a network so that it includes only validated nodes. As an example, you can use the low_connectivity_nodes() method to identify nodes that may not be connected to the larger network, then exclude those nodes when saving to HDF5. In the current design of Pandana it’s not possible to modify a Network in place. Instead, save the network (with low-connectivity nodes removed) to HDF5 and then load it from that file by re-starting the notebook.

3. Calculate accessibility to any amenity we retrieved


In [8]:
# precomputes the range queries (the reachable nodes within this maximum distance)
# so, as long as you use a smaller distance, cached results will be used
network.precompute(distance + 1)

In [9]:
# initialize the underlying C++ points-of-interest engine
network.init_pois(num_categories=num_categories, max_dist=distance, max_pois=num_pois)

In [10]:
# initialize a category for all amenities with the locations specified by the lon and lat columns
network.set_pois(category='all', x_col=pois['lon'], y_col=pois['lat'])

In [11]:
# searches for the n nearest amenities (of all types) to each node in the network
all_access = network.nearest_pois(distance=distance, category='all', num_pois=num_pois)

# it returned a df with the number of columns equal to the number of POIs that are requested
# each cell represents the network distance from the node to each of the n POIs
print('{:,} nodes'.format(len(all_access)))
all_access.head()


22,554 nodes
Out[11]:
1 2 3 4 5 6 7 8 9 10
id
30366199 282.615997 399.686005 434.403992 521.252014 611.322998 822.515991 872.346985 1000.0 1000.0 1000.0
30366200 292.700012 319.678986 354.397003 441.244995 531.315979 742.508972 792.340027 1000.0 1000.0 1000.0
30374146 639.367004 655.778992 655.778992 655.778992 655.778992 921.804016 1000.000000 1000.0 1000.0 1000.0
33917655 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.0 1000.0 1000.0
33947074 603.728027 655.109009 682.762024 777.801025 1000.000000 1000.000000 1000.000000 1000.0 1000.0 1000.0

4. Plot accessibility from each node to any amenity


In [12]:
# distance to the nearest amenity of any type
n = 1
bmap, fig, ax = network.plot(all_access[n], bbox=bbox, plot_kwargs=plot_kwargs, fig_kwargs=fig_kwargs, 
                             bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to nearest amenity around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-all-east-bay.png', dpi=200, bbox_inches='tight')



In [13]:
# distance to the 5th nearest amenity of any type
n = 5
bmap, fig, ax = network.plot(all_access[n], bbox=bbox, plot_kwargs=plot_kwargs, fig_kwargs=fig_kwargs, 
                             bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to 5th nearest amenity around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-all-5th-east-bay.png', dpi=200, bbox_inches='tight')



In [14]:
# distance to the nearest amenity of any type, as hexbins
bmap, fig, ax = network.plot(all_access[1], bbox=bbox, plot_type='hexbin', plot_kwargs=hex_plot_kwargs, 
                             fig_kwargs=fig_kwargs, bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to nearest amenity around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-all-hexbin-east-bay.png', dpi=200, bbox_inches='tight')


5. Calculate and plot accessibility separately for each amenity type

The amenity types specified at the beginning area: restaurants, bars, and schools


In [15]:
# initialize each amenity category with the locations specified by the lon and lat columns
for amenity in amenities:
    pois_subset = pois[pois['amenity']==amenity]
    network.set_pois(category=amenity, x_col=pois_subset['lon'], y_col=pois_subset['lat'])

In [16]:
# distance to the nearest restaurant
restaurant_access = network.nearest_pois(distance=distance, category='restaurant', num_pois=num_pois)
bmap, fig, ax = network.plot(restaurant_access[1], bbox=bbox, plot_kwargs=plot_kwargs, 
                             fig_kwargs=fig_kwargs, bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to nearest restaurant around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-restaurant-east-bay.png', dpi=200, bbox_inches='tight')



In [17]:
# distance to the nearest bar
bar_access = network.nearest_pois(distance=distance, category='bar', num_pois=num_pois)
bmap, fig, ax = network.plot(bar_access[1], bbox=bbox, plot_kwargs=plot_kwargs, 
                             fig_kwargs=fig_kwargs, bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to nearest bar around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-bar-east-bay.png', dpi=200, bbox_inches='tight')



In [18]:
# distance to the nearest school
school_access = network.nearest_pois(distance=distance, category='school', num_pois=num_pois)
bmap, fig, ax = network.plot(school_access[1], bbox=bbox, plot_kwargs=plot_kwargs, 
                             fig_kwargs=fig_kwargs, bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)
ax.set_title('Walking distance (m) to nearest school around Berkeley/Oakland', fontsize=15)
fig.savefig('images/accessibility-school-east-bay.png', dpi=200, bbox_inches='tight')


6. Network aggregation queries

Normally you'd want to do this with population or square footage or something similar. I'm just using 'capacity' because it's a variable that I can pull from OSM. But, only like one POI in this dataset has a positive value for capacity (and its value is 100 and it's in downtown Oakland).


In [19]:
# first get the nearest node_id for each POI, based on its lat-lon
pois['node_ids'] = network.get_node_ids(pois['lon'], pois['lat'])

In [20]:
# attach the value of the capacity variable to each corresponding node in the network
network.set(pois['node_ids'], variable=pois['capacity'], name='capacity')


Removed 942 rows because they contain missing values

In [21]:
# sum (with linear decay) the capacities within specified distance from each node in network
capacity_sum = network.aggregate(distance=distance, type='sum', decay='linear', name='capacity')

In [22]:
# plot the accessibility to 'capacity'
bmap, fig, ax = network.plot(capacity_sum, bbox=bbox, plot_type='scatter', plot_kwargs=agg_plot_kwargs, 
                             fig_kwargs=fig_kwargs, bmap_kwargs=bmap_kwargs, cbar_kwargs=cbar_kwargs)
ax.set_axis_bgcolor(bgcolor)


For a simpler demo, check out pandana-accessibility-demo-simple.ipynb