In [1]:
import seaborn as sns
import metapack as mp
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display 

%matplotlib inline
sns.set_context('notebook')
mp.jupyter.init()

In [22]:
pkg = mp.jupyter.open_package()
#pkg = mp.jupyter.open_source_package()
pkg


Out[22]:

San Diego Parking Time Series

sandiego.gov-cityiq_parking-2 Last Update: 2019-02-18T01:53:21

15-minute interval parking utilization for 1,600 parking zones in the city of San Diego.

This dataset is compiled from parking events scraped from the San Diego CityIQ smart streetlight system, via the cityiq Python package. It is built from PKIN and PKOUT events between Sept 2018 and Feb 2019 for the whole San Diego system.

The dataset is heavily processed to eliminate duplicate events, because the raw feed contains many spurious events and, in particular, an excess of PKIN events. When computing the number of cars parked across all parking zones, the excess of PKIN events results in about 60,000 extra cars per month. These issues are explored in a Jupyter Notebook.

The records in this dataset reference parking zones. More information, including geographic positions, is available in the CityIQ Objects dataset.

Processing

These data were produced with these programs:

$ pip install cityiq
$ ciq_config -w
# Edit .cityiq-config.yaml with client-id and secret
# Scrape PKIN and PKOUT from Sept 2018 to present
$ ciq_events -s -e PKIN -e PKOUT -t 20180901
# Split the event dump into per-location CSV files
$ ciq_events -S
# Deduplicate and normalize
$ ciq_events -n

The last step, deduplication and normalization, involves three sub-steps (sketched in code below):

  • Group events by event type, location, and 1-second period, and select only one record from each group.
  • Collect runs of events of a single type and select only the first record of each run, up to a run of 4 minutes long.
  • For each location, compute the cumulative sum of ins and outs (giving the number of cars in the zone), then compute a rolling 2-day average and subtract it off.
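
A minimal pandas sketch of these steps for a single location, assuming a raw event frame ev with time and eventtype ('PKIN'/'PKOUT') columns. The names here are illustrative; the actual pipeline is the cityiq package's ciq_events -n:

ev = ev.sort_values('time')

# Step 1: keep one record per event type and 1-second period
ev['second'] = ev.time.dt.floor('1s')
ev = ev.drop_duplicates(['eventtype', 'second'])

# Step 2: keep only the first record of each run of a single event type
# (the 4-minute cap on run length is omitted in this sketch)
ev = ev[ev.eventtype.ne(ev.eventtype.shift())]

# Step 3: cumulative sum of ins and outs, minus a 2-day rolling mean
delta = ev.set_index('time').eventtype.map({'PKIN': 1, 'PKOUT': -1})
cs = delta.cumsum()
cs_norm = cs - cs.rolling('2d').mean()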

The third step is demonstrated in this image:

The blue line is the original utilization for a single location, showing the drift caused by the larger number of PKIN events than PKOUT events. The red line is the 2-day rolling average, and the green line is the result of subtracting the 2-day rolling average.

In the final dataset, the data for the blue line is in the cs column, which is the cumulative sum of the delta column. The green line is the data in the cs_norm column, which is differentiated to create the delta_norm column.

For most purposes you should use cs_norm and delta_norm.
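
In code, the column relationships are roughly as follows (a sketch of the definitions above, not the package's own code; assumes a per-location series with a DatetimeIndex):

cs = delta.cumsum()                      # blue line: raw count of parked cars
cs_norm = cs - cs.rolling('2d').mean()   # green line: drift removed
delta_norm = cs_norm.diff()              # per-interval flow, drift removed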

Contacts

References

  • parking_events. Parking events
  • assets. Data package with metadata about the parking zone locations.
  • locations. Data package with metadata about the parking zone locations.

In [23]:
assets = pkg.reference('assets').dataframe()
locations = pkg.reference('locations').dataframe()
prk = pkg.reference('parking_events').dataframe()

prk.columns  = [e.lower() for e in prk.columns]

In [148]:
len(locations)


Out[148]:
6616

In [34]:
prk_loc = prk.merge(locations, on='locationuid')

In [95]:
df = prk_loc[prk_loc.community_name == 'Downtown'].copy()

In [112]:
df.community_name.value_counts()


Out[112]:
Downtown    9477418
Name: community_name, dtype: int64

In [75]:
loc_df = df[df.locationuid == 'ngxb22vyhqcjksjde40']
t = loc_df.groupby([loc_df.time.dt.hour, loc_df.time.dt.weekday]).sum()
fig, ax = plt.subplots(figsize=(4, 12)) 
sns.heatmap(t[['delta_norm']].unstack(), ax=ax,  square=True, cmap="BrBG")


Out[75]:
<matplotlib.axes._subplots.AxesSubplot at 0x131eaf128>

In [134]:
df['month'] = df.time.apply( lambda v: v.date().replace(day=15))
df.head()


Out[134]:
time locationuid delta cs delta_norm cs_norm locationtype parentlocationuid community_name tract_geoid roadsegid speed oneway abloaddr abhiaddr rd30full geometry flow month
0 2018-09-01 12:30:00 v645089nhwojixbldr0 -1 -1 0 4 PARKING_ZONE v645089nhwojixbldr0 Downtown 14000US00000006937 38442.0 20.0 B 1100.0 1199.0 J ST POLYGON ((-117.1539287914682 32.70946929697975... 0 2018-09-15
1 2018-09-01 12:45:00 v645089nhwojixbldr0 0 -1 0 4 PARKING_ZONE v645089nhwojixbldr0 Downtown 14000US00000006937 38442.0 20.0 B 1100.0 1199.0 J ST POLYGON ((-117.1539287914682 32.70946929697975... 0 2018-09-15
2 2018-09-01 13:00:00 v645089nhwojixbldr0 0 -1 0 4 PARKING_ZONE v645089nhwojixbldr0 Downtown 14000US00000006937 38442.0 20.0 B 1100.0 1199.0 J ST POLYGON ((-117.1539287914682 32.70946929697975... 0 2018-09-15
3 2018-09-01 13:15:00 v645089nhwojixbldr0 2 1 2 6 PARKING_ZONE v645089nhwojixbldr0 Downtown 14000US00000006937 38442.0 20.0 B 1100.0 1199.0 J ST POLYGON ((-117.1539287914682 32.70946929697975... 1 2018-09-15
4 2018-09-01 13:30:00 v645089nhwojixbldr0 -2 -1 -2 4 PARKING_ZONE v645089nhwojixbldr0 Downtown 14000US00000006937 38442.0 20.0 B 1100.0 1199.0 J ST POLYGON ((-117.1539287914682 32.70946929697975... -1 2018-09-15
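
The apply above is simple but slow on roughly 9.5 million rows; a vectorized equivalent (a sketch; note it yields Timestamps rather than date objects):

df['month'] = df.time.dt.to_period('M').dt.to_timestamp() + pd.Timedelta(days=14)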

In [147]:
from matplotlib.pyplot import xticks, xlabel, suptitle
t = df.groupby([df.month, df.time.dt.hour]).sum()
fig, ax = plt.subplots(figsize=(8, 4)) 
ax = sns.heatmap(t[['delta_norm']].unstack(), ax=ax, cmap="BrBG");
locs, labels = xticks()
xticks(locs, [ f'{e}' for e in range(24)]);
xlabel("Hour");
suptitle("Parking Flow By Hour of Day and Month of Data");




In [98]:
df['flow'] = df.delta_norm.apply( lambda v : 0 if abs(v) < 2 else 1 if v > 0 else -1  )
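
The same thresholding can be done without apply (a sketch using np.sign; equivalent to the lambda above for integer delta_norm values):

df['flow'] = np.sign(df.delta_norm).where(df.delta_norm.abs() >= 2, 0).astype(int)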

In [122]:
t = df.groupby([df.time.dt.hour, df.time.dt.week+(df.time.dt.year*100)]).sum().copy()
t['flow'] = t.delta_norm.apply( lambda v : 1 if v > 0 else -1  )
fig, ax = plt.subplots(figsize=(8, 8)) 
sns.heatmap(t[['flow']].unstack(), ax=ax,  square=True, cmap="viridis")


Out[122]:
<matplotlib.axes._subplots.AxesSubplot at 0x137fea0f0>
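
Note that Series.dt.week is deprecated in recent pandas; an equivalent year-week key can be built from isocalendar() (a sketch):

iso = df.time.dt.isocalendar()
week_key = iso.year * 100 + iso.week
t = df.groupby([df.time.dt.hour, week_key]).sum()

Using iso.year rather than df.time.dt.year also avoids mislabeling weeks that straddle an ISO year boundary.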

In [121]:
t = df.groupby([df.time.dt.hour, df.time.dt.dayofweek]).sum().copy()
t['flow'] = t.delta_norm.apply( lambda v : 1 if v > 0 else -1  )
fig, ax = plt.subplots(figsize=(8, 8)) 
sns.heatmap(t[['flow']].unstack(), ax=ax,  square=True, cmap="viridis")


Out[121]:
<matplotlib.axes._subplots.AxesSubplot at 0x137f19828>
