Visualize traffic incident reports in San Francisco.
Data sources:
In [1]:
from cartoframes.auth import set_default_credentials, Credentials
from cartoframes.viz import Map, Layer, Source
import pandas as pd
import geopandas as gpd
If you have a CARTO account, you can set your credentials in the following cell. This allows you to upload the dataset and share the final visualization through your account.
In [2]:
set_default_credentials('creds.json')
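For readers who do not have a credentials file yet, here is a minimal sketch of creating one. It assumes the JSON file uses `username` and `api_key` keys, as shown in the CARTOframes authentication guide. The file name `creds_example.json` and both values are placeholders, not real credentials.

```python
import json
from pathlib import Path


def write_example_credentials(path):
    """Write a placeholder credentials file and return its parsed content."""
    # Placeholder values: replace them with your own CARTO username and API key.
    credentials = {
        "username": "your_user_name",
        "api_key": "your_api_key",
    }
    Path(path).write_text(json.dumps(credentials, indent=2))
    # Read the file back to confirm it contains valid JSON.
    return json.loads(Path(path).read_text())


example = write_example_credentials("creds_example.json")
print(sorted(example))
```

After filling in real values, the path of this file is what would be passed to `set_default_credentials`.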
In [3]:
incident_reports_df = pd.read_csv('http://data.sfgov.org/resource/wg3w-h783.csv')
incident_reports_df.head()
Out[3]:
In [4]:
incident_reports_df.columns
Out[4]:
Some of the latitude and longitude values are NaN, so the next step drops those rows. After that, we create a GeoDataFrame from the DataFrame and use it in a Layer to visualize the data:
In [5]:
# Drop the rows whose coordinates are missing (NaN)
incident_reports_df = incident_reports_df.dropna(subset=['longitude', 'latitude'])
# Build point geometries: x is the longitude, y is the latitude
incident_reports_gdf = gpd.GeoDataFrame(
    incident_reports_df,
    geometry=gpd.points_from_xy(incident_reports_df.longitude, incident_reports_df.latitude)
)
incident_reports_gdf.head()
Out[5]:
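The sketch below compares two ways of dropping rows with missing coordinates: the self-comparison idiom (NaN is never equal to itself) and `dropna`. The `sample_df` frame and its values are made up for illustration and do not come from the real dataset.

```python
import pandas as pd

# Hypothetical rows standing in for the incident reports; None becomes NaN.
sample_df = pd.DataFrame({
    "incident_id": [1, 2, 3, 4],
    "longitude": [-122.41, None, -122.43, -122.40],
    "latitude": [37.77, 37.78, None, 37.76],
})

# NaN is the only floating-point value that is not equal to itself,
# so comparing a column with itself keeps only the rows holding a real number.
self_equal = sample_df[
    (sample_df.longitude == sample_df.longitude)
    & (sample_df.latitude == sample_df.latitude)
]

# dropna expresses the same filter more directly.
dropped = sample_df.dropna(subset=["longitude", "latitude"])

print(dropped.incident_id.tolist())
```

Both approaches keep the same rows; `dropna` states the intent more clearly.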
In [7]:
Layer(incident_reports_gdf)
Out[7]:
Now, we are going to use a helper method to color the points by category. The category is the day of the week (incident_day_of_week):
In [20]:
from cartoframes.viz import Layer, color_category_style
Layer(incident_reports_gdf, color_category_style('incident_day_of_week', top=7), title='Day of Week')
Out[20]:
As we can see in the legend, the days are sorted by frequency, which means that there are fewer incidents on Thursdays and more on Tuesdays. Since our purpose is not to visualize the frequency, and we want the legend to list the days in order from Monday to Sunday, we can modify the helper and pass the categories in the desired order:
In [21]:
from cartoframes.viz import color_category_style
Layer(
    incident_reports_gdf,
    color_category_style(
        'incident_day_of_week',
        cat=[
            'Monday',
            'Tuesday',
            'Wednesday',
            'Thursday',
            'Friday',
            'Saturday',
            'Sunday'
        ]
    ),
    title='Day of Week'
)
Out[21]:
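To see the difference between frequency order and calendar order outside the map, one could count the rows per day with pandas. This is only a sketch on a small made-up sample (`sample_days`), not the real dataset, and it does not describe how the legend is computed internally.

```python
import pandas as pd

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# Hypothetical sample of the 'incident_day_of_week' column.
sample_days = pd.Series(
    ["Tuesday", "Monday", "Tuesday", "Sunday", "Friday", "Tuesday", "Monday"],
    name="incident_day_of_week",
)

# value_counts sorts by descending frequency, so the busiest day comes first.
by_frequency = sample_days.value_counts()

# Reindexing puts the counts in calendar order; days absent from the sample get 0.
by_calendar = by_frequency.reindex(WEEKDAYS, fill_value=0)

print(by_frequency.index[0])
print(by_calendar.tolist())
```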
Now, we want to find the traffic-related incident categories and then use them to visualize those incidents:
In [10]:
incident_reports_df.incident_category.unique()
Out[10]:
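Rather than scanning the full list of categories by eye, the traffic-related ones could be picked out programmatically. The sketch below assumes that the relevant category names contain the word "traffic"; `sample_df` is a made-up stand-in for the incident reports.

```python
import pandas as pd

# Hypothetical sample of the 'incident_category' column, including a missing value.
sample_df = pd.DataFrame({
    "incident_category": [
        "Larceny Theft", "Traffic Collision", "Assault",
        "Traffic Violation Arrest", "Traffic Collision", None,
    ],
})

# unique() lists each distinct value once; dropna() skips missing categories.
categories = sample_df.incident_category.dropna().unique()

# Keep the categories whose name mentions traffic (case-insensitive).
traffic_categories = sorted(c for c in categories if "traffic" in c.lower())

# Select only the rows that belong to one of those categories.
traffic_df = sample_df[sample_df.incident_category.isin(traffic_categories)]

print(traffic_categories)
print(len(traffic_df))
```

The resulting list is the kind of value that can be passed as `cat` to a category style helper.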
In [22]:
from cartoframes.viz import Layer, size_category_style
Layer(
    incident_reports_gdf,
    size_category_style(
        'incident_category',
        cat=['Traffic Collision', 'Traffic Violation Arrest']
    ),
    title='Traffic Incidents'
)
Out[22]:
In CARTO there is a dataset we can use for the next step, named 'sfcta_congestion_roads'. We are going to set the Credentials for this dataset. If you have a CARTO account, you can import the dataset into it to keep everything together and have more control over it; in that case there is no need to create a separate Source with its own credentials.
Once the data source is created, we combine two helper methods. The first one styles the Source with the roads data from CARTO, and the second one styles the traffic incident reports.
In [23]:
from cartoframes.viz import Layer, color_continuous_style, size_category_style
sfcta_congestion_roads_source = Source(
    'sfcta_congestion_roads',
    Credentials(
        base_url='https://cartovl.carto.com',
        api_key='default_public'
    )
)
Map([
    Layer(
        sfcta_congestion_roads_source,
        color_continuous_style('auto_speed'),
        title='Recorded vehicle speeds'
    ),
    Layer(
        incident_reports_gdf,
        size_category_style(
            'incident_category',
            cat=['Traffic Collision', 'Traffic Violation Arrest']
        ),
        title='Traffic Incidents'
    )
])
Out[23]:
We are going to add information about traffic signals, by getting data from a different source:
In [30]:
traffic_signals_df = pd.read_csv('http://data.sfgov.org/resource/c8ue-f4py.csv')
traffic_signals_df.head()
Out[30]:
In [31]:
traffic_signals_df.columns
Out[31]:
In [32]:
traffic_signals_df.code.unique()
Out[32]:
Since there are no latitude and longitude columns, we can use the point column to create a GeoDataFrame.
In [33]:
from shapely import wkt
# Parse the WKT strings in the 'point' column into shapely geometries
traffic_signals_df['point'] = traffic_signals_df['point'].apply(wkt.loads)
# Rename the column to 'geometry' and build the GeoDataFrame from it
traffic_signals_df = traffic_signals_df.rename(columns={'point': 'geometry'})
trafic_signals_gdf = gpd.GeoDataFrame(traffic_signals_df, geometry='geometry')
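The point column is parsed as WKT (well-known text), where a point is written as `POINT (x y)`: the first number is the longitude and the second the latitude. As a dependency-free illustration of that format, and not a replacement for `wkt.loads`, the sketch below reads such a string with a regular expression. The helper name `parse_wkt_point` and the sample value are made up for this example.

```python
import re

# Matches "POINT (x y)" with optional signs and decimals.
# In WKT the x coordinate (longitude) comes before the y coordinate (latitude).
POINT_PATTERN = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def parse_wkt_point(text):
    """Return (longitude, latitude) from a WKT point string, or None if it does not match."""
    match = POINT_PATTERN.match(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


coords = parse_wkt_point("POINT (-122.41 37.77)")
print(coords)
```

Keeping the x-then-y order in mind helps avoid swapping longitude and latitude when building geometries by hand.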
In [34]:
Map(Layer(trafic_signals_gdf))
Out[34]:
In [36]:
from cartoframes.viz import Layer, color_continuous_style, size_category_style, basic_style
Map([
    Layer(
        sfcta_congestion_roads_source,
        color_continuous_style('auto_speed'),
        title='Recorded vehicle speeds'
    ),
    Layer(
        incident_reports_gdf,
        size_category_style(
            'incident_category',
            cat=['Traffic Collision', 'Traffic Violation Arrest']
        ),
        title='Traffic Incidents'
    ),
    Layer(
        trafic_signals_gdf,
        basic_style(color='blue', size=1),
        title='Traffic Signals'
    )
],
    layer_selector=True)
Out[36]: