Author: UrbanSim
This notebook provides a brief overview of the main functionality of UrbanAccess with examples using AC Transit and BART GTFS data and OpenStreetMap (OSM) pedestrian network data to create an integrated transit and pedestrian network for Oakland, CA for use in Pandana network accessibility queries.
UrbanAccess on UDST: https://github.com/UDST/urbanaccess
UrbanAccess documentation: https://udst.github.io/urbanaccess/index.html
UrbanAccess citation:
Samuel D. Blanchard and Paul Waddell, 2017, "UrbanAccess: Generalized Methodology for Measuring Regional Accessibility with an Integrated Pedestrian and Transit Network" Transportation Research Record: Journal of the Transportation Research Board, 2653: 35–44.
Notes:
For UrbanAccess installation instructions see: https://udst.github.io/urbanaccess/installation.html
This notebook contains optional Pandana examples, which require Pandana to be installed. For instructions, see: http://udst.github.io/pandana/installation.html
In [ ]:
import pandas as pd
import pandana as pdna
import time
import urbanaccess as ua
from urbanaccess.config import settings
from urbanaccess.gtfsfeeds import feeds
from urbanaccess import gtfsfeeds
from urbanaccess.gtfs.gtfsfeeds_dataframe import gtfsfeeds_dfs
from urbanaccess.network import ua_network, load_network
%matplotlib inline
In [ ]:
# Pandana currently uses deprecated parameters in matplotlib; this hides the warning until it is fixed
import warnings
import matplotlib.cbook
warnings.filterwarnings("ignore",category=matplotlib.cbook.mplDeprecation)
The settings object is a global urbanaccess_config object that can be used to set default options in UrbanAccess. In general, these options do not need to be changed.
In [ ]:
settings.to_dict()
For example, you can stop log messages from printing in the notebook by setting:
In [ ]:
settings.log_console = False
Turn printing back on for now:
In [ ]:
settings.log_console = True
The GTFS feeds object is a global urbanaccess_gtfsfeeds object that allows you to save and manage information needed to download multiple GTFS feeds. This object is a dictionary of the names of GTFS feeds or agencies and the URLs to use to download the corresponding feeds.
In [ ]:
feeds.to_dict()
You can use the search function to find feeds on the GTFS Data Exchange. (Note: the GTFS Data Exchange has not been maintained since summer 2016, so feeds found there may be out of date.)
Let's search for feeds for transit agencies in the GTFS Data Exchange that we know serve Oakland, CA: 1) Bay Area Rapid Transit District (BART) which runs the metro rail service and 2) AC Transit which runs bus services.
Let's start by finding the feed for the Bay Area Rapid Transit District (BART) by using the search term Bay Area Rapid Transit:
In [ ]:
gtfsfeeds.search(search_text='Bay Area Rapid Transit',
search_field=None,
match='contains')
Now that we have seen what can be found on the GTFS Data Exchange, let's run the search again, this time adding the feed from the search to the feed download list:
In [ ]:
gtfsfeeds.search(search_text='Bay Area Rapid Transit',
search_field=None,
match='contains',
add_feed=True)
If you know of a GTFS feed located elsewhere or one that is more up to date, you can add additional feeds located at custom URLs by adding a dictionary with the key as the name of the service/agency and the value as the URL.
Let's do this for AC Transit which also operates in Oakland, CA.
The link to their feed is here: http://www.actransit.org/planning-focus/data-resource-center/. Let's get the latest version as of June 18, 2017:
In [ ]:
feeds.add_feed(add_dict={'ac transit': 'http://www.actransit.org/wp-content/uploads/GTFSJune182017B.zip'})
Note the two GTFS feeds now in your feeds object ready to download
In [ ]:
feeds.to_dict()
Use the download function to download all the feeds in your feeds object at once. If no parameters are specified the existing feeds object will be used to acquire the data.
By default, your data will be downloaded into the directory of this notebook in the folder: data
In [ ]:
gtfsfeeds.download()
Now that we have downloaded our data, let's load our individual GTFS feeds (currently a series of text files stored on disk) into a combined network of Pandas DataFrames.
The folder to load feeds from is set with the gtfsfeed_path parameter. If you want to aggregate multiple transit networks together, all the GTFS feeds you want to aggregate must be inside of a single root folder.
Turn on validation and set a bounding box with the remove_stops_outsidebbox parameter turned on to ensure all your GTFS feed data are within a specified area.
Let's specify a bounding box of coordinates for the City of Oakland to subset the GTFS data to. You can generate a bounding box by going to http://boundingbox.klokantech.com/ and selecting the CSV format.
In [ ]:
validation = True
verbose = True
# bbox for City of Oakland
bbox = (-122.355881,37.632226,-122.114775,37.884725)
remove_stops_outsidebbox = True
append_definitions = True
loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(gtfsfeed_path=None,
validation=validation,
verbose=verbose,
bbox=bbox,
remove_stops_outsidebbox=remove_stops_outsidebbox,
append_definitions=append_definitions)
The output is a global urbanaccess_gtfs_df object that can be accessed with the specified variable loaded_feeds. This object holds all the individual GTFS feed files aggregated together, with each GTFS feed file type in a separate Pandas DataFrame, to represent all the loaded transit feeds in a metropolitan area.
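To make the bounding box step more concrete, here is a minimal sketch in plain Python of what keeping only the stops inside a bounding box amounts to. The stop records, their approximate coordinates, and the in_bbox helper are illustrative assumptions, not part of UrbanAccess, and whether UrbanAccess treats the box edges as inclusive is not something this sketch claims.

```python
# Hypothetical stop records with approximate coordinates; real GTFS
# stops.txt rows carry more fields than these.
stops = [
    {"stop_id": "12TH", "stop_lon": -122.271450, "stop_lat": 37.803768},
    {"stop_id": "EMBR", "stop_lon": -122.396742, "stop_lat": 37.792976},
    {"stop_id": "FTVL", "stop_lon": -122.224175, "stop_lat": 37.774836},
]

# (lng_min, lat_min, lng_max, lat_max), the same order as the Oakland bbox.
bbox = (-122.355881, 37.632226, -122.114775, 37.884725)


def in_bbox(lon, lat, bbox):
    """Return True when the point lies inside the box (edges inclusive here)."""
    lng_min, lat_min, lng_max, lat_max = bbox
    return lng_min <= lon <= lng_max and lat_min <= lat <= lat_max


# Keep only the stops that fall inside the bounding box.
kept = [s["stop_id"] for s in stops if in_bbox(s["stop_lon"], s["stop_lat"], bbox)]
print(kept)
```

The stop west of the box (across the Bay in San Francisco) is dropped, while the two Oakland stops are kept.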
In [ ]:
loaded_feeds.stops.head()
Note the two transit services we have aggregated into one regional table
In [ ]:
loaded_feeds.stops.unique_agency_id.unique()
Quickly view the transit stop locations
In [ ]:
loaded_feeds.stops.plot(kind='scatter', x='stop_lon', y='stop_lat', s=0.1)
In [ ]:
loaded_feeds.routes.head()
In [ ]:
loaded_feeds.stop_times.head()
In [ ]:
loaded_feeds.trips.head()
In [ ]:
loaded_feeds.calendar.head()
Now that we have loaded and standardized our GTFS data, let's create a travel time weighted graph from the GTFS feeds we have loaded.
Create a network for weekday monday service between 7 am and 10 am (['07:00:00', '10:00:00']) to represent travel times during the AM Peak period.
Assumptions: We are using the service ids in the calendar file to subset the day of week; however, if your feed uses the calendar_dates file and not the calendar file, then you can use the calendar_dates_lookup parameter. This is not required for AC Transit and BART.
In [ ]:
ua.gtfs.network.create_transit_net(gtfsfeeds_dfs=loaded_feeds,
day='monday',
timerange=['07:00:00', '10:00:00'],
calendar_dates_lookup=None)
The output is a global urbanaccess_network object. This object holds the resulting graph, comprised of nodes and edges, for the processed GTFS network data for services operating at the day and time you specified, inside of transit_edges and transit_nodes.
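As a rough illustration of what a travel time weighted transit edge is, the sketch below filters the departures of one hypothetical trip to the 07:00:00 to 10:00:00 window and takes the difference between consecutive stops in minutes. The trip data and the to_seconds helper are assumptions made for the example; UrbanAccess's own trip selection and interpolation logic may differ.

```python
def to_seconds(hhmmss):
    """Convert a GTFS 'HH:MM:SS' string to seconds after midnight.

    GTFS allows hours of 24 or more for trips that run past midnight,
    so the string is parsed by hand instead of with datetime.
    """
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Hypothetical stop_times rows for one trip: (stop_id, departure_time).
trip = [("A", "06:55:00"), ("B", "07:03:00"), ("C", "07:10:30"), ("D", "10:02:00")]

start, end = to_seconds("07:00:00"), to_seconds("10:00:00")

# Keep only departures that fall inside the requested time window.
in_window = [(stop, t) for stop, t in trip if start <= to_seconds(t) <= end]

# Travel time in minutes between consecutive stops that remain.
edges = [
    (a, b, (to_seconds(tb) - to_seconds(ta)) / 60.0)
    for (a, ta), (b, tb) in zip(in_window, in_window[1:])
]
print(edges)
```

Only the B to C segment falls fully inside the window, so it is the only edge produced, with a weight of 7.5 minutes.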
Let's set the global network object to a variable called urbanaccess_net that we can then inspect:
In [ ]:
urbanaccess_net = ua.network.ua_network
In [ ]:
urbanaccess_net.transit_edges.head()
In [ ]:
urbanaccess_net.transit_nodes.head()
In [ ]:
urbanaccess_net.transit_nodes.plot(kind='scatter', x='x', y='y', s=0.1)
Now let's download OpenStreetMap (OSM) pedestrian street network data to produce a graph network of nodes and edges for Oakland, CA. We will use the same bounding box as before.
In [ ]:
nodes, edges = ua.osm.load.ua_network_from_bbox(bbox=bbox,
remove_lcn=True)
Now that we have our pedestrian network data let's create a travel time weighted graph from the pedestrian network we have loaded and add it to our existing UrbanAccess network object. We will assume a pedestrian travels on average at 3 mph.
The resulting weighted network will be added to your UrbanAccess network object inside osm_nodes and osm_edges.
In [ ]:
ua.osm.network.create_osm_net(osm_edges=edges,
osm_nodes=nodes,
travel_speed_mph=3)
Let's inspect the results, which we can access inside of the existing urbanaccess_net variable:
In [ ]:
urbanaccess_net.osm_nodes.head()
In [ ]:
urbanaccess_net.osm_edges.head()
In [ ]:
urbanaccess_net.osm_nodes.plot(kind='scatter', x='x', y='y', s=0.1)
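The walking travel times follow from edge length and the assumed walking speed. The sketch below shows that arithmetic only; it assumes edge lengths are in meters, and walk_minutes and METERS_PER_MILE are names made up for this example rather than UrbanAccess functions.

```python
METERS_PER_MILE = 1609.344


def walk_minutes(length_m, speed_mph=3.0):
    """Travel time in minutes to walk length_m meters at speed_mph."""
    miles = length_m / METERS_PER_MILE
    return miles / speed_mph * 60.0


# One mile at 3 mph takes 20 minutes; a 100 m block takes a bit over a minute.
print(round(walk_minutes(1609.344), 2))
print(round(walk_minutes(100.0), 2))
```

A higher travel_speed_mph shortens every pedestrian edge proportionally, which in turn widens the area reachable within a given time threshold.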
Now let's integrate the two networks together. The resulting graph will be added to your existing UrbanAccess network object. After running this step, your network will be ready to be used with Pandana.
The resulting integrated network will be added to your UrbanAccess network object inside net_nodes and net_edges.
In [ ]:
ua.network.integrate_network(urbanaccess_network=urbanaccess_net,
headways=False)
Let's inspect the results, which we can access inside of the existing urbanaccess_net variable:
In [ ]:
urbanaccess_net.net_nodes.head()
In [ ]:
urbanaccess_net.net_edges.head()
In [ ]:
urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type'] == 'transit'].head()
You can save the final processed integrated network net_nodes and net_edges to disk inside of an HDF5 file. By default, the file will be saved to the directory of this notebook in the folder data.
In [ ]:
ua.network.save_network(urbanaccess_network=urbanaccess_net,
filename='final_net.h5',
overwrite_key = True)
You can load an existing processed integrated network HDF5 file from disk into an UrbanAccess network object.
In [ ]:
urbanaccess_net = ua.network.load_network(filename='final_net.h5')
You can visualize the network you just created using basic UrbanAccess plot functions
In [ ]:
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=1.1, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Use the col_colors function to color edges by travel time. In this case, the darker the red, the higher the travel time.
Note that you can see AC Transit's major bus arterial routes (in darker red), transfer locations, and the BART rail network along with the underlying pedestrian network. Rail stations are visible as junctions with multiple bus connections, most clearly in downtown Oakland at the 19th Street, 12th Street, and Lake Merritt stations and at the Fruitvale and Coliseum stations.
Downtown Oakland is located near the white cutout in the northeast middle section of the network, which represents Lake Merritt.
In [ ]:
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Let's zoom in closer to downtown Oakland using a new smaller extent bbox. Note the bus routes on the major arterials and the BART routes from station to station.
In [ ]:
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=(-122.282295, 37.795, -122.258434, 37.816022),
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
You can also slice the network by network type
In [ ]:
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type']=='transit'],
bbox=None,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
In [ ]:
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type']=='walk'],
bbox=None,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
You can slice the network using any attribute in edges. In this case, let's examine a single AC Transit route, 51A.
Looking at what routes are in the network for 51A we see route id: 51A-141_ac_transit
In [ ]:
urbanaccess_net.net_edges['unique_route_id'].unique()
In [ ]:
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['unique_route_id']=='51A-141_ac_transit'],
bbox=bbox,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
We can also slice the data by agency. In this case let's view all BART routes.
Looking at what agencies are in the network for BART we see agency id: bay_area_rapid_transit
In [ ]:
urbanaccess_net.net_edges['unique_agency_id'].unique()
In [ ]:
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['unique_agency_id']=='bay_area_rapid_transit'],
bbox=bbox,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
The network we have generated so far only contains pure travel times. UrbanAccess allows for the calculation and addition of route stop level average headways to the network. These are used as a proxy for passenger wait times at stops and stations. The route stop level average headways are added to the pedestrian to transit connector edges.
Let's calculate headways for the same AM Peak time period. Statistics on route stop level headways will be added to your GTFS transit data object inside of headways.
In [ ]:
ua.gtfs.headways.headways(gtfsfeeds_df=loaded_feeds,
headway_timerange=['07:00:00','10:00:00'])
In [ ]:
loaded_feeds.headways.head()
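As a small illustration of what a route stop level average headway means, the sketch below takes the departures of one route at one stop and averages the gaps between consecutive departures. The departure times and the to_minutes helper are hypothetical; this is not the UrbanAccess implementation, only the idea behind the statistic.

```python
from statistics import mean


def to_minutes(hhmmss):
    """Convert a GTFS 'HH:MM:SS' string to minutes after midnight."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 60 + m + s / 60.0


# Hypothetical departures of one route at one stop during the AM peak.
departures = ["07:00:00", "07:12:00", "07:20:00", "07:40:00", "08:00:00"]

times = sorted(to_minutes(t) for t in departures)
# A headway is the gap between consecutive departures of the same route at the stop.
headways = [later - earlier for earlier, later in zip(times, times[1:])]

print(headways)
print(mean(headways))
```

Other summary statistics (for example the minimum or maximum gap) can be computed from the same list of headways.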
Now that headways have been calculated and added to your GTFS transit feed object, you can use them to generate a new integrated network that incorporates the headways within the pedestrian to transit connector edge travel times.
In [ ]:
ua.network.integrate_network(urbanaccess_network=urbanaccess_net,
headways=True,
urbanaccess_gtfsfeeds_df=loaded_feeds,
headway_statistic='mean')
In [ ]:
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Pandana (Pandas Network Analysis) is a tool to compute network accessibility metrics.
Now that we have an integrated transit and pedestrian network that has been formatted for use with Pandana, we can use Pandana right away to compute accessibility metrics.
There are a couple of things to remember about UrbanAccess and Pandana:
The integrated network is a one way network, so set any Pandana twoway parameters to False (they are True by default) to indicate that the network is a one way network.
Node ids and the from and to columns in your network must be integer type and not string. UrbanAccess automatically generates both string and integer types, so use the from_int and to_int columns in edges and the id_int index in nodes.
For more on Pandana see the:
Pandana repo: https://github.com/UDST/pandana
Pandana documentation: http://udst.github.io/pandana/
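To illustrate the integer id requirement, here is a minimal sketch of mapping string node ids to integers and rewriting an edge list accordingly. UrbanAccess already does this for you (the from_int and to_int columns); the edge list and id_map below are hypothetical and only show the idea.

```python
# Hypothetical edges keyed by string node ids.
edges = [("stop_A", "osm_1"), ("osm_1", "osm_2"), ("osm_2", "stop_A")]

# Assign each distinct node id an integer in order of first appearance.
id_map = {}
for from_id, to_id in edges:
    for node in (from_id, to_id):
        id_map.setdefault(node, len(id_map) + 1)

# Rewrite the edge list using the integer ids.
edges_int = [(id_map[f], id_map[t]) for f, t in edges]
print(id_map)
print(edges_int)
```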
Let's load 2010 Census block data for the 9 county Bay Area. Note: These data have been processed from original Census and LEHD data.
The data are located in the demo folder on the repo with this notebook.
In [ ]:
blocks = pd.read_hdf('bay_area_demo_data.h5','blocks')
# remove blocks that contain all water
blocks = blocks[blocks['square_meters_land'] != 0]
print('Total number of blocks: {:,}'.format(len(blocks)))
blocks.head()
Let's subset the Census data to just be the bounding box for Oakland
In [ ]:
# bbox is ordered (lng_min, lat_min, lng_max, lat_max)
lng_min, lat_min, lng_max, lat_max = bbox
# keep only the blocks whose coordinates fall strictly inside the bounding box
inside_bbox = ((lng_min < blocks["x"]) & (blocks["x"] < lng_max) &
               (lat_min < blocks["y"]) & (blocks["y"] < lat_max))
blocks_subset = blocks[inside_bbox].copy()
print('Total number of subset blocks: {:,}'.format(len(blocks_subset)))
In [ ]:
blocks_subset.plot(kind='scatter', x='x', y='y', s=0.1)
Let's initialize our Pandana network object using the transit and pedestrian network we created. Note the use of from_int and to_int, as well as twoway=False, which denotes that this is an explicit one way network.
In [ ]:
s_time = time.time()
transit_ped_net = pdna.Network(urbanaccess_net.net_nodes["x"],
urbanaccess_net.net_nodes["y"],
urbanaccess_net.net_edges["from_int"],
urbanaccess_net.net_edges["to_int"],
urbanaccess_net.net_edges[["weight"]],
twoway=False)
print('Took {:,.2f} seconds'.format(time.time() - s_time))
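In an explicit one way network, every direction of travel is its own row in the edge table. The sketch below shows what that looks like for two hypothetical walkable street segments; the segment data is made up for illustration and is not taken from the Oakland network.

```python
# Hypothetical street segments that can be walked in both directions:
# (node_a, node_b, minutes).
segments = [(1, 2, 4.0), (2, 3, 6.5)]

# A one way edge table lists each direction of travel as a separate row.
directed = []
for a, b, minutes in segments:
    directed.append((a, b, minutes))
    directed.append((b, a, minutes))

print(directed)
```

Because both directions are already present as rows where they apply, the edge table is passed to Pandana as-is with twoway=False.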
Now let's set our blocks on to the network
In [ ]:
blocks_subset['node_id'] = transit_ped_net.get_node_ids(blocks_subset['x'], blocks_subset['y'])
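Conceptually, setting blocks on to the network means finding the closest network node for each block. The brute force sketch below shows that idea with straight-line distance on made-up planar coordinates; the nodes, blocks, and nearest_node helper are hypothetical, and Pandana's own lookup is more efficient than this loop.

```python
import math

# Hypothetical network nodes: node id -> (x, y).
nodes = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0)}


def nearest_node(x, y, nodes):
    """Return the id of the node closest to (x, y) by straight-line distance."""
    return min(nodes, key=lambda n: math.hypot(nodes[n][0] - x, nodes[n][1] - y))


# Hypothetical block centroids.
blocks = {"block_a": (0.1, 0.2), "block_b": (0.9, 0.8)}
block_nodes = {name: nearest_node(x, y, nodes) for name, (x, y) in blocks.items()}
print(block_nodes)
```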
Now let's compute an accessibility metric, in this case a cumulative accessibility metric. See Pandana for other metrics that can be calculated.
Let's set the block variables we want to use as our accessibility metric on the Pandana network. In this case, let's use jobs:
In [ ]:
transit_ped_net.set(blocks_subset.node_id, variable = blocks_subset.jobs, name='jobs')
Now let's run a cumulative accessibility query using our network and the jobs variable for three different travel time thresholds: 15, 30, and 45 minutes.
Note: Depending on network size, radius threshold, computer processing power, and whether or not you are using multiple cores the compute process may take some time.
In [ ]:
s_time = time.time()
jobs_45 = transit_ped_net.aggregate(45, type='sum', decay='linear', name='jobs')
jobs_30 = transit_ped_net.aggregate(30, type='sum', decay='linear', name='jobs')
jobs_15 = transit_ped_net.aggregate(15, type='sum', decay='linear', name='jobs')
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Quickly visualize the accessibility query results. As expected, note that a travel time of 15 minutes results in a lower number of jobs accessible at each network node.
In [ ]:
print(jobs_45.head())
print(jobs_30.head())
print(jobs_15.head())
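To show what an aggregation query of this kind computes, here is a small self-contained sketch: shortest travel times from one node over a directed, minute-weighted graph, followed by a sum of jobs within a threshold. The toy graph, the job counts, and both helper functions are hypothetical, and the linear decay form 1 - t/T is an assumption made for illustration rather than a statement of Pandana's internals.

```python
import heapq


def travel_times(graph, source, max_minutes):
    """Dijkstra over a directed graph; returns {node: minutes} within max_minutes."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost <= max_minutes and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best


def accessible_jobs(graph, jobs, source, max_minutes, decay="flat"):
    """Sum jobs reachable from source, optionally down-weighted linearly by time."""
    total = 0.0
    for node, minutes in travel_times(graph, source, max_minutes).items():
        weight = 1.0 if decay == "flat" else 1.0 - minutes / max_minutes
        total += weight * jobs.get(node, 0)
    return total


# Hypothetical one way network: node -> [(neighbor, minutes)].
graph = {1: [(2, 10.0)], 2: [(3, 10.0)], 3: [(1, 40.0)]}
jobs = {1: 100, 2: 200, 3: 400}

print(accessible_jobs(graph, jobs, 1, 15))
print(accessible_jobs(graph, jobs, 1, 30))
print(accessible_jobs(graph, jobs, 1, 30, decay="linear"))
```

With a flat weighting, every job within the threshold counts fully, so a larger threshold can only increase the total. With a linear decay, jobs that take longer to reach contribute less.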
Note how the radius of the number of jobs accessible expands as the time threshold increases where high accessibility is indicated in dark red. You can easily see downtown Oakland has the highest accessibility due to a convergence of transit routes and because downtown is where the majority of jobs in the area are located. Other high accessibility areas are visible elsewhere directly adjacent to BART metro rail stations of West Oakland, Fruitvale, and Coliseum and AC Transit bus routes on the main arterial road corridors.
In [ ]:
s_time = time.time()
transit_ped_net.plot(jobs_15,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'26943','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
In [ ]:
s_time = time.time()
transit_ped_net.plot(jobs_30,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'26943','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
In [ ]:
s_time = time.time()
transit_ped_net.plot(jobs_45,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'26943','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))