Single Station Analysis - Historic

This notebook builds on the API Exploration notebook and the Filtering Observed Arrivals notebook, and explores a different approach to analyzing the data. I modified the API scraper to retrieve only the soonest time from the next subway API, which should (hopefully) help with some of the issues we were having previously. I made a new database and ran the scraper for a few hours on Sunday, polling only St. George station (station_id == 10) once every 10 seconds. I will post the data online so that others can try it out.

Created by Rami on May 6/2018


In [36]:
import datetime
from psycopg2 import connect
import configparser
import pandas as pd
import pandas.io.sql as pandasql
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

In [12]:
%matplotlib qt

In [13]:
try:
    con.close()
except NameError:
    # con is not defined yet the first time this cell is run
    print("No existing connection... moving on")

In [14]:
CONFIG = configparser.ConfigParser(interpolation=None)
CONFIG.read('../db.cfg')
dbset = CONFIG['DBSETTINGS']
con = connect(**dbset)
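
The connection cell above assumes a file `../db.cfg` with a `DBSETTINGS` section whose keys are passed to `connect` as keyword arguments. As a rough sketch of what that could look like (every value below is a hypothetical placeholder, not the project's real settings), the same parsing can be tried against an in-memory string:

```python
import configparser

# Hypothetical contents of db.cfg; every value here is a placeholder.
EXAMPLE_CFG = """
[DBSETTINGS]
host = localhost
dbname = ttc
user = example_user
password = example_password
"""

config = configparser.ConfigParser(interpolation=None)
config.read_string(EXAMPLE_CFG)

# A section behaves like a mapping, so connect(**dbset) would pass
# each key/value pair as a keyword argument.
dbset = dict(config['DBSETTINGS'])
print(sorted(dbset))
```

Passing `interpolation=None` means a `%` character in a value (for example in a password) is read literally instead of being treated as interpolation syntax.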

Retrieving data from the database

Let's start by getting data from our database, joining the requests and ntas_data tables on requestid. To simplify things, we're only going to look at Southbound trains for now.


In [15]:
sql = '''SELECT requestid, stationid, lineid, create_date, request_date, station_char, subwayline, system_message_type, 
            timint, traindirection, trainid, train_message
FROM requests
INNER JOIN ntas_data USING (requestid)
WHERE request_date >= '2017-06-14'::DATE + interval '6 hours 5 minutes' 
AND request_date <  '2017-06-14'::DATE + interval '29 hours'
AND stationid = 10
AND traindirection = 'South'
ORDER BY request_date
'''
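
The `interval '29 hours'` bound in the query above ends the window at 05:00 on the following day, so late-night trains after midnight stay in the same service day. A minimal sketch of the same window computed in Python follows; the names `window_start`, `window_end`, `sql_param` and `params` are illustrative and not part of the original notebook, and the shortened column list is only for brevity.

```python
import datetime

service_date = datetime.date(2017, 6, 14)
midnight = datetime.datetime.combine(service_date, datetime.time())

# Same bounds as the SQL intervals: 06:05 on the service date up to
# 29 hours after midnight (05:00 the next morning).
window_start = midnight + datetime.timedelta(hours=6, minutes=5)
window_end = midnight + datetime.timedelta(hours=29)

# The same query with named placeholders in psycopg2's %(name)s style.
sql_param = '''SELECT requestid, create_date, timint, trainid, train_message
FROM requests
INNER JOIN ntas_data USING (requestid)
WHERE request_date >= %(start)s
AND request_date < %(end)s
AND stationid = %(stationid)s
AND traindirection = %(direction)s
ORDER BY request_date
'''
params = {'start': window_start, 'end': window_end,
          'stationid': 10, 'direction': 'South'}
print(window_start, window_end)
```

With an open connection, this could presumably be run as `pandasql.read_sql(sql_param, con, params=params)`, which makes it easier to repeat the analysis for another date or station.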

In [26]:
stg_south = pandasql.read_sql(sql, con)

In [27]:
stg_south


Out[27]:
requestid stationid lineid create_date request_date station_char subwayline system_message_type timint traindirection trainid train_message
0 5167447 10 1 2017-06-14 06:05:04 2017-06-14 06:05:01.866256 SGU2 YUS Normal 6.199715 South 137 Arriving
1 5167447 10 1 2017-06-14 06:05:04 2017-06-14 06:05:01.866256 SGU2 YUS Normal 11.552266 South 131 Arriving
2 5167447 10 1 2017-06-14 06:05:04 2017-06-14 06:05:01.866256 SGU2 YUS Normal 16.417125 South 838 Delayed
3 5167515 10 1 2017-06-14 06:06:06 2017-06-14 06:06:02.215526 SGU2 YUS Normal 11.040278 South 131 Arriving
4 5167515 10 1 2017-06-14 06:06:06 2017-06-14 06:06:02.215526 SGU2 YUS Normal 15.478819 South 141 Arriving
5 5167515 10 1 2017-06-14 06:06:06 2017-06-14 06:06:02.215526 SGU2 YUS Normal 16.417125 South 838 Delayed
6 5167583 10 1 2017-06-14 06:07:06 2017-06-14 06:07:02.313463 SGU2 YUS Normal 10.073480 South 131 Arriving
7 5167583 10 1 2017-06-14 06:07:06 2017-06-14 06:07:02.313463 SGU2 YUS Normal 14.007009 South 141 Arriving
8 5167583 10 1 2017-06-14 06:07:06 2017-06-14 06:07:02.313463 SGU2 YUS Normal 16.417125 South 838 Delayed
9 5167651 10 1 2017-06-14 06:08:10 2017-06-14 06:08:06.800012 SGU2 YUS Normal 5.233515 South 137 Arriving
10 5167651 10 1 2017-06-14 06:08:10 2017-06-14 06:08:06.800012 SGU2 YUS Normal 9.587593 South 131 Arriving
11 5167651 10 1 2017-06-14 06:08:10 2017-06-14 06:08:06.800012 SGU2 YUS Normal 13.501595 South 141 Arriving
12 5167719 10 1 2017-06-14 06:09:05 2017-06-14 06:09:02.082338 SGU2 YUS Normal 4.512351 South 137 Arriving
13 5167719 10 1 2017-06-14 06:09:05 2017-06-14 06:09:02.082338 SGU2 YUS Normal 8.684744 South 131 Arriving
14 5167719 10 1 2017-06-14 06:09:05 2017-06-14 06:09:02.082338 SGU2 YUS Normal 12.688393 South 141 Arriving
15 5167787 10 1 2017-06-14 06:10:05 2017-06-14 06:10:02.076938 SGU2 YUS Normal 3.729031 South 137 Arriving
16 5167787 10 1 2017-06-14 06:10:05 2017-06-14 06:10:02.076938 SGU2 YUS Normal 7.898435 South 131 Arriving
17 5167787 10 1 2017-06-14 06:10:05 2017-06-14 06:10:02.076938 SGU2 YUS Normal 11.552266 South 141 Arriving
18 5167855 10 1 2017-06-14 06:11:09 2017-06-14 06:11:07.520184 SGU2 YUS Normal 2.534529 South 137 Arriving
19 5167855 10 1 2017-06-14 06:11:09 2017-06-14 06:11:07.520184 SGU2 YUS Normal 7.039214 South 131 Arriving
20 5167855 10 1 2017-06-14 06:11:09 2017-06-14 06:11:07.520184 SGU2 YUS Normal 10.556781 South 141 Arriving
21 5167923 10 1 2017-06-14 06:12:09 2017-06-14 06:12:05.343393 SGU2 YUS Normal 1.664551 South 137 Arriving
22 5167923 10 1 2017-06-14 06:12:09 2017-06-14 06:12:05.343393 SGU2 YUS Normal 6.199715 South 131 Arriving
23 5167923 10 1 2017-06-14 06:12:09 2017-06-14 06:12:05.343393 SGU2 YUS Normal 9.885819 South 141 Arriving
24 5167991 10 1 2017-06-14 06:13:06 2017-06-14 06:13:02.211966 SGU2 YUS Normal 0.957729 South 137 Arriving
25 5167991 10 1 2017-06-14 06:13:06 2017-06-14 06:13:02.211966 SGU2 YUS Normal 5.696542 South 131 Arriving
26 5167991 10 1 2017-06-14 06:13:06 2017-06-14 06:13:02.211966 SGU2 YUS Normal 9.587593 South 141 Arriving
27 5168059 10 1 2017-06-14 06:14:06 2017-06-14 06:14:02.165433 SGU2 YUS Normal 0.000000 South 137 AtStation
28 5168059 10 1 2017-06-14 06:14:06 2017-06-14 06:14:02.165433 SGU2 YUS Normal 4.978515 South 131 Arriving
29 5168059 10 1 2017-06-14 06:14:06 2017-06-14 06:14:02.165433 SGU2 YUS Normal 8.684744 South 141 Arriving
... ... ... ... ... ... ... ... ... ... ... ... ...
3555 5248023 10 1 2017-06-15 01:50:14 2017-06-15 01:50:02.422329 SGU2 YUS Normal 0.580080 South 838 Arriving
3556 5248023 10 1 2017-06-15 01:50:14 2017-06-15 01:50:02.422329 SGU2 YUS Normal 3.309362 South 152 Arriving
3557 5248023 10 1 2017-06-15 01:50:14 2017-06-15 01:50:02.422329 SGU2 YUS Normal 4.318331 South 809 Arriving
3558 5248091 10 1 2017-06-15 01:51:14 2017-06-15 01:51:02.415042 SGU2 YUS Normal 0.000000 South 838 AtStation
3559 5248091 10 1 2017-06-15 01:51:14 2017-06-15 01:51:02.415042 SGU2 YUS Normal 2.422056 South 152 Arriving
3560 5248091 10 1 2017-06-15 01:51:14 2017-06-15 01:51:02.415042 SGU2 YUS Normal 3.564362 South 809 Arriving
3561 5248159 10 1 2017-06-15 01:52:12 2017-06-15 01:52:02.404154 SGU2 YUS Normal 1.159522 South 152 Arriving
3562 5248159 10 1 2017-06-15 01:52:12 2017-06-15 01:52:02.404154 SGU2 YUS Normal 2.422056 South 809 Arriving
3563 5248159 10 1 2017-06-15 01:52:12 2017-06-15 01:52:02.404154 SGU2 YUS Normal 8.723740 South 874 Arriving
3564 5248227 10 1 2017-06-15 01:53:10 2017-06-15 01:53:01.534627 SGU2 YUS Normal 0.283633 South 152 Arriving
3565 5248227 10 1 2017-06-15 01:53:10 2017-06-15 01:53:01.534627 SGU2 YUS Normal 1.371189 South 809 Arriving
3566 5248227 10 1 2017-06-15 01:53:10 2017-06-15 01:53:01.534627 SGU2 YUS Normal 7.572967 South 874 Arriving
3567 5248295 10 1 2017-06-15 01:54:14 2017-06-15 01:54:02.234853 SGU2 YUS Normal 0.000000 South 152 AtStation
3568 5248295 10 1 2017-06-15 01:54:14 2017-06-15 01:54:02.234853 SGU2 YUS Normal 0.736318 South 809 Arriving
3569 5248295 10 1 2017-06-15 01:54:14 2017-06-15 01:54:02.234853 SGU2 YUS Normal 6.747313 South 874 Arriving
3570 5248363 10 1 2017-06-15 01:55:14 2017-06-15 01:55:01.750195 SGU2 YUS Normal 0.365478 South 809 Arriving
3571 5248363 10 1 2017-06-15 01:55:14 2017-06-15 01:55:01.750195 SGU2 YUS Normal 5.940287 South 874 Arriving
3572 5248363 10 1 2017-06-15 01:55:14 2017-06-15 01:55:01.750195 SGU2 YUS Normal 14.831384 South 672 Arriving
3573 5248431 10 1 2017-06-15 01:56:13 2017-06-15 01:56:02.426207 SGU2 YUS Normal 0.000000 South 809 AtStation
3574 5248431 10 1 2017-06-15 01:56:13 2017-06-15 01:56:02.426207 SGU2 YUS Normal 5.645776 South 874 Arriving
3575 5248431 10 1 2017-06-15 01:56:13 2017-06-15 01:56:02.426207 SGU2 YUS Normal 13.427176 South 672 Arriving
3576 5248499 10 1 2017-06-15 01:57:11 2017-06-15 01:57:01.466266 SGU2 YUS Normal 0.000000 South 809 AtStation
3577 5248499 10 1 2017-06-15 01:57:11 2017-06-15 01:57:01.466266 SGU2 YUS Normal 5.013598 South 874 Arriving
3578 5248499 10 1 2017-06-15 01:57:11 2017-06-15 01:57:01.466266 SGU2 YUS Normal 12.408482 South 672 Arriving
3579 5248567 10 1 2017-06-15 01:58:14 2017-06-15 01:58:02.208761 SGU2 YUS Normal 0.000000 South 809 Delayed
3580 5248567 10 1 2017-06-15 01:58:14 2017-06-15 01:58:02.208761 SGU2 YUS Normal 4.318331 South 874 Arriving
3581 5248567 10 1 2017-06-15 01:58:14 2017-06-15 01:58:02.208761 SGU2 YUS Normal 11.535980 South 672 Arriving
3582 5248635 10 1 2017-06-15 01:59:14 2017-06-15 01:59:01.926096 SGU2 YUS Normal 0.000000 South 809 Delayed
3583 5248635 10 1 2017-06-15 01:59:14 2017-06-15 01:59:01.926096 SGU2 YUS Normal 3.309362 South 874 Arriving
3584 5248635 10 1 2017-06-15 01:59:14 2017-06-15 01:59:01.926096 SGU2 YUS Normal 10.582762 South 672 Arriving

3585 rows × 12 columns


In [28]:
stg_south_resamp = stg_south[stg_south.index % 3 == 0]
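
Keeping every third row relies on each request contributing exactly three rows with the soonest train listed first. A sketch of a selection that does not depend on row order, using plain tuples copied from the first two requests in the output above (the `soonest` dictionary is an illustrative name):

```python
# (requestid, timint, trainid) copied from the first two requests shown above
rows = [
    (5167447, 6.199715, 137),
    (5167447, 11.552266, 131),
    (5167447, 16.417125, 838),
    (5167515, 11.040278, 131),
    (5167515, 15.478819, 141),
    (5167515, 16.417125, 838),
]

soonest = {}
for requestid, timint, trainid in rows:
    # Keep the row with the smallest reported wait for each request
    if requestid not in soonest or timint < soonest[requestid][0]:
        soonest[requestid] = (timint, trainid)

print(soonest)
```

On the DataFrame itself, something like `stg_south.loc[stg_south.groupby('requestid')['timint'].idxmin()]` should give the equivalent selection.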

In [29]:
stg_south_resamp


Out[29]:
requestid stationid lineid create_date request_date station_char subwayline system_message_type timint traindirection trainid train_message
0 5167447 10 1 2017-06-14 06:05:04 2017-06-14 06:05:01.866256 SGU2 YUS Normal 6.199715 South 137 Arriving
3 5167515 10 1 2017-06-14 06:06:06 2017-06-14 06:06:02.215526 SGU2 YUS Normal 11.040278 South 131 Arriving
6 5167583 10 1 2017-06-14 06:07:06 2017-06-14 06:07:02.313463 SGU2 YUS Normal 10.073480 South 131 Arriving
9 5167651 10 1 2017-06-14 06:08:10 2017-06-14 06:08:06.800012 SGU2 YUS Normal 5.233515 South 137 Arriving
12 5167719 10 1 2017-06-14 06:09:05 2017-06-14 06:09:02.082338 SGU2 YUS Normal 4.512351 South 137 Arriving
15 5167787 10 1 2017-06-14 06:10:05 2017-06-14 06:10:02.076938 SGU2 YUS Normal 3.729031 South 137 Arriving
18 5167855 10 1 2017-06-14 06:11:09 2017-06-14 06:11:07.520184 SGU2 YUS Normal 2.534529 South 137 Arriving
21 5167923 10 1 2017-06-14 06:12:09 2017-06-14 06:12:05.343393 SGU2 YUS Normal 1.664551 South 137 Arriving
24 5167991 10 1 2017-06-14 06:13:06 2017-06-14 06:13:02.211966 SGU2 YUS Normal 0.957729 South 137 Arriving
27 5168059 10 1 2017-06-14 06:14:06 2017-06-14 06:14:02.165433 SGU2 YUS Normal 0.000000 South 137 AtStation
30 5168127 10 1 2017-06-14 06:15:05 2017-06-14 06:15:01.616962 SGU2 YUS Normal 4.257351 South 131 Arriving
33 5168195 10 1 2017-06-14 06:16:06 2017-06-14 06:16:02.414494 SGU2 YUS Normal 3.474031 South 131 Arriving
36 5168263 10 1 2017-06-14 06:17:05 2017-06-14 06:17:01.729177 SGU2 YUS Normal 2.534529 South 131 Arriving
39 5168331 10 1 2017-06-14 06:18:05 2017-06-14 06:18:01.698790 SGU2 YUS Normal 1.212729 South 131 Arriving
42 5168399 10 1 2017-06-14 06:19:05 2017-06-14 06:19:02.135651 SGU2 YUS Normal 0.614202 South 131 Arriving
45 5168467 10 1 2017-06-14 06:20:05 2017-06-14 06:20:02.325968 SGU2 YUS Normal 0.000000 South 131 AtStation
48 5168535 10 1 2017-06-14 06:21:08 2017-06-14 06:21:05.104026 SGU2 YUS Normal 3.957929 South 141 Arriving
51 5168603 10 1 2017-06-14 06:22:05 2017-06-14 06:22:01.576576 SGU2 YUS Normal 3.116040 South 141 Arriving
54 5168671 10 1 2017-06-14 06:23:05 2017-06-14 06:23:02.122965 SGU2 YUS Normal 2.534529 South 141 Arriving
57 5168739 10 1 2017-06-14 06:24:05 2017-06-14 06:24:01.856485 SGU2 YUS Normal 1.212729 South 141 Arriving
60 5168807 10 1 2017-06-14 06:25:05 2017-06-14 06:25:01.844635 SGU2 YUS Normal 0.637988 South 141 Arriving
63 5168875 10 1 2017-06-14 06:26:02 2017-06-14 06:26:02.081188 SGU2 YUS Normal 0.000000 South 141 AtStation
66 5168943 10 1 2017-06-14 06:27:06 2017-06-14 06:27:02.254455 SGU2 YUS Normal 2.534529 South 143 Arriving
69 5169011 10 1 2017-06-14 06:28:03 2017-06-14 06:28:01.867800 SGU2 YUS Normal 1.891656 South 143 Arriving
72 5169079 10 1 2017-06-14 06:29:06 2017-06-14 06:29:02.253810 SGU2 YUS Normal 0.957729 South 143 Arriving
75 5169147 10 1 2017-06-14 06:30:05 2017-06-14 06:30:02.382857 SGU2 YUS Normal 0.000000 South 143 AtStation
78 5169215 10 1 2017-06-14 06:31:09 2017-06-14 06:31:05.005267 SGU2 YUS Normal 4.512351 South 146 Arriving
81 5169283 10 1 2017-06-14 06:32:10 2017-06-14 06:32:06.909556 SGU2 YUS Normal 3.957929 South 146 Arriving
84 5169351 10 1 2017-06-14 06:33:05 2017-06-14 06:33:01.905890 SGU2 YUS Normal 3.474031 South 146 Arriving
87 5169419 10 1 2017-06-14 06:34:06 2017-06-14 06:34:02.278264 SGU2 YUS Normal 2.534529 South 146 Arriving
... ... ... ... ... ... ... ... ... ... ... ... ...
3495 5246663 10 1 2017-06-15 01:30:13 2017-06-15 01:30:01.828001 SGU2 YUS Normal 2.422056 South 144 Arriving
3498 5246731 10 1 2017-06-15 01:31:13 2017-06-15 01:31:02.178199 SGU2 YUS Normal 1.371189 South 144 Arriving
3501 5246799 10 1 2017-06-15 01:32:12 2017-06-15 01:32:02.046908 SGU2 YUS Normal 0.602544 South 144 Arriving
3504 5246867 10 1 2017-06-15 01:33:13 2017-06-15 01:33:02.294897 SGU2 YUS Normal 0.000000 South 144 AtStation
3507 5246935 10 1 2017-06-15 01:34:13 2017-06-15 01:34:01.938788 SGU2 YUS Normal 3.564362 South 145 Arriving
3510 5247003 10 1 2017-06-15 01:35:13 2017-06-15 01:35:02.411738 SGU2 YUS Normal 2.422056 South 145 Arriving
3513 5247071 10 1 2017-06-15 01:36:14 2017-06-15 01:36:02.403219 SGU2 YUS Normal 1.371189 South 145 Arriving
3516 5247139 10 1 2017-06-15 01:37:14 2017-06-15 01:37:02.001731 SGU2 YUS Normal 0.580080 South 145 Arriving
3519 5247207 10 1 2017-06-15 01:38:14 2017-06-15 01:38:02.308221 SGU2 YUS Normal 0.000000 South 145 AtStation
3522 5247275 10 1 2017-06-15 01:39:15 2017-06-15 01:39:04.755527 SGU2 YUS Normal 4.063331 South 148 Arriving
3525 5247343 10 1 2017-06-15 01:40:14 2017-06-15 01:40:02.072266 SGU2 YUS Normal 3.564362 South 148 Arriving
3528 5247411 10 1 2017-06-15 01:41:12 2017-06-15 01:41:02.430344 SGU2 YUS Normal 2.422056 South 148 Arriving
3531 5247479 10 1 2017-06-15 01:42:14 2017-06-15 01:42:01.965104 SGU2 YUS Normal 1.586242 South 148 Arriving
3534 5247547 10 1 2017-06-15 01:43:13 2017-06-15 01:43:01.768641 SGU2 YUS Normal 1.159522 South 148 Arriving
3537 5247615 10 1 2017-06-15 01:44:16 2017-06-15 01:44:05.130549 SGU2 YUS Normal 0.283633 South 148 Arriving
3540 5247683 10 1 2017-06-15 01:45:13 2017-06-15 01:45:01.639589 SGU2 YUS Normal 0.000000 South 148 AtStation
3543 5247751 10 1 2017-06-15 01:46:13 2017-06-15 01:46:02.034260 SGU2 YUS Normal 2.422056 South 151 Arriving
3546 5247819 10 1 2017-06-15 01:47:14 2017-06-15 01:47:02.358010 SGU2 YUS Normal 1.586242 South 151 Arriving
3549 5247887 10 1 2017-06-15 01:48:14 2017-06-15 01:48:02.263229 SGU2 YUS Normal 0.736318 South 151 Arriving
3552 5247955 10 1 2017-06-15 01:49:12 2017-06-15 01:49:02.243198 SGU2 YUS Normal 0.000000 South 151 AtStation
3555 5248023 10 1 2017-06-15 01:50:14 2017-06-15 01:50:02.422329 SGU2 YUS Normal 0.580080 South 838 Arriving
3558 5248091 10 1 2017-06-15 01:51:14 2017-06-15 01:51:02.415042 SGU2 YUS Normal 0.000000 South 838 AtStation
3561 5248159 10 1 2017-06-15 01:52:12 2017-06-15 01:52:02.404154 SGU2 YUS Normal 1.159522 South 152 Arriving
3564 5248227 10 1 2017-06-15 01:53:10 2017-06-15 01:53:01.534627 SGU2 YUS Normal 0.283633 South 152 Arriving
3567 5248295 10 1 2017-06-15 01:54:14 2017-06-15 01:54:02.234853 SGU2 YUS Normal 0.000000 South 152 AtStation
3570 5248363 10 1 2017-06-15 01:55:14 2017-06-15 01:55:01.750195 SGU2 YUS Normal 0.365478 South 809 Arriving
3573 5248431 10 1 2017-06-15 01:56:13 2017-06-15 01:56:02.426207 SGU2 YUS Normal 0.000000 South 809 AtStation
3576 5248499 10 1 2017-06-15 01:57:11 2017-06-15 01:57:01.466266 SGU2 YUS Normal 0.000000 South 809 AtStation
3579 5248567 10 1 2017-06-15 01:58:14 2017-06-15 01:58:02.208761 SGU2 YUS Normal 0.000000 South 809 Delayed
3582 5248635 10 1 2017-06-15 01:59:14 2017-06-15 01:59:01.926096 SGU2 YUS Normal 0.000000 South 809 Delayed

1195 rows × 12 columns

Extracting some useful information

Now we need to process the data to extract some useful information from the raw ntas_data. To do this we're going to go row by row through the table shown above to get arrival times, departure times and wait times.

arrival_times are the times at which a train arrives at St. George station
departure_times are the times at which a train leaves St. George station
all_wait_times are all the reported wait times from every API call (which in this case is every 10 seconds)
expected_wait_times are the expected wait times immediately after a train has departed the station. They represent the worst case wait times.


In [30]:
arrival_times = []
departure_times = []
all_wait_times = []
all_time_stamps = []
expected_wait_times = []
prev_arrival_train_id = -1

for index, row in stg_south_resamp.iterrows():
    # Every poll contributes one reported wait time and its timestamp
    all_wait_times.append(row['timint'])
    all_time_stamps.append(row['create_date'])
    # A change in the soonest train's id marks a new train approaching
    if (row['trainid'] != prev_arrival_train_id):
        arrival_times.append(row['create_date'])
        prev_arrival_train_id = row['trainid']
        # The previous train has just left, so this poll also counts as its
        # departure and carries the worst-case reported wait for the new train
        departure_times.append(row['create_date'])
        expected_wait_times.append(row['timint'])
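
In the loop above, the arrival and departure timestamps are appended in the same branch, so both lists hold the moment a new train becomes the soonest one. One possible way to separate the two events, assuming an `AtStation` message marks an arrival and a change of soonest train marks the previous train's departure, is sketched below. The polls are a subset of the rows shown earlier; `arrivals`, `departures` and `dwell_seconds` are illustrative names.

```python
from datetime import datetime

# (create_date, trainid, train_message) for a subset of the polls shown above
polls = [
    (datetime(2017, 6, 14, 6, 13, 6), 137, 'Arriving'),
    (datetime(2017, 6, 14, 6, 14, 6), 137, 'AtStation'),
    (datetime(2017, 6, 14, 6, 15, 5), 131, 'Arriving'),
    (datetime(2017, 6, 14, 6, 19, 5), 131, 'Arriving'),
    (datetime(2017, 6, 14, 6, 20, 5), 131, 'AtStation'),
    (datetime(2017, 6, 14, 6, 21, 8), 141, 'Arriving'),
]

arrivals = {}    # trainid -> first poll at which the train is reported AtStation
departures = {}  # trainid -> first poll at which a different train is the soonest
prev_train = None
for stamp, trainid, message in polls:
    if message == 'AtStation' and trainid not in arrivals:
        arrivals[trainid] = stamp
    if prev_train is not None and trainid != prev_train and prev_train not in departures:
        departures[prev_train] = stamp
    prev_train = trainid

# Dwell time is only known to within the polling interval
dwell_seconds = {t: (departures[t] - arrivals[t]).total_seconds()
                 for t in arrivals if t in departures}
print(dwell_seconds)
```

Because both events are only observed at poll times, the resulting dwell times are accurate to roughly one polling interval at best.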

We can look at all the reported wait times. While this is somewhat interesting, it doesn't tell us very much.


In [206]:
plt.plot(all_time_stamps,all_wait_times)
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('All reported wait times at St. George')
plt.savefig('all_wait_times.png', dpi=500)
plt.show()

In [286]:
def timeToArrival(all_time_stamps,all_wait_times,arrival_times):
    # For each poll, compute the time until the next entry in arrival_times
    actual_wait_times = []
    i = 0
    k = 0
    arrival_time = arrival_times[i]
    for time in all_time_stamps:
        # A reported wait of 0 means a train is at the station: no wait
        if (all_wait_times[k] == 0):
            actual_wait_times.append(pd.Timedelta(0))
            k+=1
            continue
        # Advance to the first arrival at or after this poll
        while ((arrival_time - time).total_seconds() < 0):
            i+=1
            if (i > (len(arrival_times) -1)):
                break
            arrival_time = arrival_times[i]
            
        actual_wait_times.append(arrival_time - time)
        k+=1
    return actual_wait_times

print(len(all_time_stamps[0:-1]))
actual_wait_times_all = timeToArrival(all_time_stamps,all_wait_times,arrival_times)


1194
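
The lookup performed by timeToArrival (find the first arrival at or after each poll) can also be written as a binary search. The sketch below is an alternative formulation, not the notebook's method; it assumes the arrival list is sorted and uses the `AtStation` poll times from the table above as stand-in arrivals. `time_to_next_arrival` is a hypothetical helper name.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# AtStation poll times taken from the table above, used as stand-in arrivals
arrivals = [
    datetime(2017, 6, 14, 6, 14, 6),
    datetime(2017, 6, 14, 6, 20, 5),
    datetime(2017, 6, 14, 6, 26, 2),
]
poll_times = [
    datetime(2017, 6, 14, 6, 13, 6),
    datetime(2017, 6, 14, 6, 14, 6),
    datetime(2017, 6, 14, 6, 15, 5),
    datetime(2017, 6, 14, 6, 27, 6),
]

def time_to_next_arrival(poll_times, arrivals):
    """Return the wait from each poll to the next arrival (None if there is none)."""
    waits = []
    for stamp in poll_times:
        i = bisect_left(arrivals, stamp)  # index of first arrival at or after stamp
        waits.append(arrivals[i] - stamp if i < len(arrivals) else None)
    return waits

waits = time_to_next_arrival(poll_times, arrivals)
print(waits)
```

Returning `None` for polls after the last known arrival avoids the negative waits that appear when the last arrival is reused.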

In [295]:
def sliding_window_filter(input_mat,window_size,overlap):
    average_time = []
    max_time = []
    for i in range(0,len(input_mat)-window_size,overlap):
        window = input_mat[i:(i+window_size)]
        average_time.append(np.mean(window))
        max_time.append(np.max(window))
    return average_time #, max_time

window_size = 30
overlap = 25
#average_time, max_time = sliding_window_filter(all_wait_times,window_size, overlap)
#times = all_time_stamps[0:len(all_time_stamps)-window_size:overlap]

#times = all_time_stamps[0:len(actual_wait_times_all)]
times = all_time_stamps[0:len(all_time_stamps)-window_size:overlap]
plt.plot(times,np.floor(sliding_window_filter(convert_timedelta_to_mins(actual_wait_times_all),window_size,overlap)))

#average_time, max_time = sliding_window_filter(convert_timedelta_to_mins(actual_wait_times_all),window_size, overlap)
plt.plot(times,np.ceil(sliding_window_filter(all_wait_times,window_size,overlap)))
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('Sliding-window average of actual and reported wait times at St. George')
plt.show()
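
To make the windowing explicit, here is a small standalone sketch of a sliding-window filter that returns both the mean and the maximum of each window. The function name `sliding_window_stats` and the parameter name `step` are illustrative; `step` plays the role of `overlap` above, i.e. how far the window start advances each time. Unlike the cell above, this version also includes the last full window.

```python
from statistics import mean

def sliding_window_stats(values, window_size, step):
    """Mean and maximum of each full window, advancing the start by `step`."""
    averages, maxima = [], []
    for start in range(0, len(values) - window_size + 1, step):
        window = values[start:start + window_size]
        averages.append(mean(window))
        maxima.append(max(window))
    return averages, maxima

# Illustrative wait times in minutes (not taken from the data set)
waits = [4, 3, 2, 1, 0, 5, 4, 3]
averages, maxima = sliding_window_stats(waits, window_size=4, step=2)
print(averages, maxima)
```

With `window_size=4` and `step=2`, consecutive windows share two samples, so the actual overlap is `window_size - step`.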

In [289]:
class sliding_figure:
    """Plot of wait times with a slider that scrolls a fixed-width time window."""

    def __init__(self,all_time_stamps,all_wait_times):
        self.fig, self.ax = plt.subplots()
        plt.subplots_adjust(bottom=0.25)

        self.t = all_time_stamps
        self.s = all_wait_times
        self.l, = self.ax.plot(self.t,self.s)

        self.y_min = 0.0
        self.y_max = max(self.s)

        # Start with the first 100 samples in view
        self.ax.axis([self.t[0], self.t[100], self.y_min, self.y_max])
        # Width of the visible window in matplotlib date units (days)
        self.x_dt = matplotlib.dates.date2num(self.t[100]) - matplotlib.dates.date2num(self.t[0])

        # Slider axes below the main plot; the slider value is a matplotlib date number
        self.axcolor = 'lightgoldenrodyellow'
        self.axpos = plt.axes([0.2, 0.1, 0.65, 0.03], facecolor=self.axcolor)
        self.spos = Slider(self.axpos, 'Pos', matplotlib.dates.date2num(self.t[0]), matplotlib.dates.date2num(self.t[-1]))

        # pretty date names
        self.fig.autofmt_xdate()

        self.plt = plt

    def update(self,val):
        # Move the visible window so that it starts at the slider position
        pos = self.spos.val
        self.xmin_time = pos
        self.xmax_time = pos + self.x_dt
        self.ax.axis([self.xmin_time, self.xmax_time, self.y_min, self.y_max])
        self.fig.canvas.draw_idle()

    def showPlot(self):
        self.spos.on_changed(self.update)
        self.plt.show()

In [296]:
wait_times_figure = sliding_figure(all_time_stamps, all_wait_times)
wait_times_figure.showPlot()

Headway analysis

By looking at the difference between consecutive arrival times at St. George, we can determine the headway (i.e. the time between trains) as they approach St. George station.


In [84]:
def time_delta(times):
    delta_times = []
    for n in range(0,len(times)-1):
        time_diff = times[n+1] - times[n]
        delta_times.append(time_diff/np.timedelta64(1, 's'))
    return delta_times

In [85]:
delta_times = time_delta(arrival_times)

In [86]:
#delta_times

In [162]:
plt.plot(arrival_times[:-1],np.multiply(delta_times,1/60.0))
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Headway (mins)')
plt.title('Headway between trains as they approach St. George')
plt.savefig('headway.png', dpi=500)
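
As a quick worked example of the headway calculation, the sketch below uses the first four `AtStation` poll times from the table above as stand-in arrival times and takes the difference between consecutive entries. The variable names are illustrative.

```python
from datetime import datetime

# First four AtStation poll times from the table above
at_station = [
    datetime(2017, 6, 14, 6, 14, 6),
    datetime(2017, 6, 14, 6, 20, 5),
    datetime(2017, 6, 14, 6, 26, 2),
    datetime(2017, 6, 14, 6, 30, 5),
]

# Pair each arrival with the next one and convert the gap to minutes
headways_mins = [(later - earlier).total_seconds() / 60.0
                 for earlier, later in zip(at_station, at_station[1:])]
print(headways_mins)
```

These four arrivals give headways of roughly 6, 6 and 4 minutes, each accurate only to about one polling interval.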


Analyzing time spent at the station

We can also look at how long trains spend at the station by taking the difference between the departure and arrival times. St. George is an interchange station, so trains tend to spend longer here than at intermediate stations.


In [88]:
time_at_station = np.subtract(departure_times[:],arrival_times[:])

In [89]:
#time_at_station

In [133]:
def convert_timedelta_to_mins(mat):
    result = []
    for element in mat:
        result.append((element/np.timedelta64(1, 'm')))
    return result

In [91]:
time_at_station_mins = convert_timedelta_to_mins(time_at_station)

In [92]:
plt.plot(departure_times,time_at_station_mins)
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Duration of time at station (mins)')
plt.title('Duration of time that trains spend at St. George Station')


Out[92]:
Text(0.5,1,'Duration of time that trains spend at St. George Station')

Expected wait times

The expected wait times represent the worst-case reported wait time, i.e. the wait reported immediately after the previous train has left the station.


In [93]:
#expected_wait_times

In [165]:
plt.plot(arrival_times,expected_wait_times)
plt.ylabel('Expected Wait Time (mins)')
plt.xticks(fontsize=10, rotation=45)
plt.xlabel('Time')
plt.title('Worst-case expected wait times for next train at St. George')
plt.savefig('expected_wait_times.png', dpi=500)




Actual wait time

It is instructive to compare the actual worst-case wait time with the expected worst-case wait time. Here we take the actual worst-case wait time to be the time between one train departing and the next train arriving (i.e. the difference between an arrival time and the previous departure time).


In [142]:
actual_wait_times = np.subtract(arrival_times[1:],arrival_times[:-1])

In [143]:
actual_wait_times_mins = convert_timedelta_to_mins(actual_wait_times)

In [166]:
plt.plot(arrival_times[1:],actual_wait_times_mins,color = 'C1')
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Actual wait time (mins)')
plt.title('Worst-case actual wait times for next train at St. George')
plt.savefig('actual_wait_times.png', dpi=500)
plt.show()
print(len(actual_wait_times_mins))


346

In [177]:
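window_size = 15
overlap = 14
average_time = sliding_window_filter(actual_wait_times_mins ,window_size, overlap)
# Maximum wait in each window, using the same window starts as sliding_window_filter
max_time = [np.max(actual_wait_times_mins[i:i+window_size])
            for i in range(0, len(actual_wait_times_mins)-window_size, overlap)]
print(len(average_time))
times = arrival_times[0:len(all_time_stamps)-window_size:overlap]
print(len(times))
plt.plot(times[1:],average_time)
plt.plot(times[1:],max_time)
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('Sliding-window average and maximum actual wait times at St. George')
plt.show()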


24
25

Comparing actual and expected wait times

Now let's put everything together and compare the actual and expected wait times.


In [139]:
print(len(expected_wait_times))
print(len(arrival_times))
print(len(departure_times))
print(len(actual_wait_times_mins))
type(arrival_times[1])

arrival_times_pdt = []
for item in arrival_times:
    arrival_times_pdt.append(datetime.time(item.to_pydatetime().hour,item.to_pydatetime().minute))
arrival_times_pdt[2]


335
335
335
334
Out[139]:
datetime.time(7, 10)

In [188]:
plt.plot(arrival_times,expected_wait_times)
plt.plot(arrival_times[1:],np.floor(actual_wait_times_mins[:]))
#plt.legend(['Expected Wait for Next Train','Actual Wait Time for Next Train'],
#           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('Comparing actual and expected wait times at St. George')
lgd = plt.legend(['Expected Wait for Next Train','Actual Wait Time for Next Train'],
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.savefig('actual_and_expected_wait_times.png', bbox_extra_artists=(lgd,), bbox_inches='tight', dpi=700)
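
To put a number on the gap between the two curves, one option is to compute the per-train difference and its mean absolute value. The sketch below does this for three trains from the table above. It measures the actual wait from the first poll at which a train is the soonest one to the poll at which it is reported `AtStation`, which is an assumption that differs slightly from the arrival-to-arrival difference used in the notebook. All variable names are illustrative.

```python
from datetime import datetime

# (trainid, first poll as soonest train, reported wait in mins, AtStation poll)
# Values copied from the resampled table above.
observations = [
    (131, datetime(2017, 6, 14, 6, 15, 5), 4.257351, datetime(2017, 6, 14, 6, 20, 5)),
    (141, datetime(2017, 6, 14, 6, 21, 8), 3.957929, datetime(2017, 6, 14, 6, 26, 2)),
    (143, datetime(2017, 6, 14, 6, 27, 6), 2.534529, datetime(2017, 6, 14, 6, 30, 5)),
]

errors = []
for trainid, first_seen, expected_mins, at_station in observations:
    actual_mins = (at_station - first_seen).total_seconds() / 60.0
    # Positive error: the train took longer than the reported wait
    errors.append(actual_mins - expected_mins)

mean_abs_error = sum(abs(e) for e in errors) / len(errors)
print(errors, mean_abs_error)
```

For these three trains the observed wait exceeds the reported wait each time, by about 0.7 minutes on average, although the polling interval limits how much can be read into differences of that size.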

In [285]:
window_size = 15
overlap = 12
average_time = sliding_window_filter(actual_wait_times_mins,window_size, overlap)
print(len(average_time))
times = arrival_times[0:len(all_time_stamps)-window_size:overlap]
print(len(times))
plt.plot(times[1:],np.floor(average_time))

average_time = sliding_window_filter(np.ceil(expected_wait_times),window_size, overlap)
plt.plot(times[1:],np.floor(average_time))
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('Sliding-window average of actual and expected wait times at St. George')
# Add the legend before plt.show() so that it is drawn with the figure
lgd = plt.legend(['Actual Wait for Next Train','Expected Wait Time for Next Train'])
plt.show()


28
29

We can also plot all the reported wait times on the same figure.


In [170]:
plt.plot(departure_times,expected_wait_times)
plt.plot(departure_times[1:],actual_wait_times_mins)
plt.plot(all_time_stamps,all_wait_times)
plt.legend(['Expected Wait for Next Train','Actual Wait Time for Next Train','All Reported Wait Times'],
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=45)
plt.ylabel('Wait Time (mins)')
plt.title('Comparing actual and expected wait times at St. George')


Out[170]:
Text(0.5,1,'Comparing actual and expected wait times at St. George')

We can also look at how long trains spend at St. George


In [24]:
plt.plot(all_time_stamps,all_wait_times)
plt.plot(arrival_times[:],time_at_station_mins)
plt.title('Duration of time trains spend at St. George')
plt.xlabel('Time')
plt.xticks(fontsize=10, rotation=90)
plt.ylabel('Time (mins)')
plt.legend(['All Reported Wait Times','Time train spends at station (mins)'],
           bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)


Out[24]:
<matplotlib.legend.Legend at 0x7fcaf99a1400>
