Exploration of traffic data from Lauttasaaren silta


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

First download the dataset and place it in the data/ folder: https://www.avoindata.fi/data/fi/dataset/liikennemaarat-helsingissa

The data comes from an automatic measurement point at: https://www.google.fi/maps/place/60%C2%B009'44.6%22N+24%C2%B053'58.6%22E/@60.1623877,24.8974263,17z/data=!3m1!4b1!4m5!3m4!1s0x0:0x0!8m2!3d60.162385!4d24.899615?hl=en

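Legend (column names are Finnish abbreviations): ha = henkilöautot (passenger cars), pa = pakettiautot (vans), ka = kuorma-autot (trucks), ra = rekka-autot (articulated trucks), la = linja-autot (buses), mp = moottoripyörät (motorcycles), rv = raitiovaunut (trams)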

Counting is done hourly, except during rush hours (06:00-09:00 and 15:00-18:00), when the interval is 15 minutes. The times given are interval start times. The cross-sections at the counting points are counted per direction (direction 1 is towards the city centre; on line D, points D1-D13, it is westbound).
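To make the time encoding concrete, here is a minimal sketch of how an `aika` value can be read. It assumes, based on the rows shown further down, that `aika` is an HHMM integer (615 means 06:15) and that the 15-minute rule applies exactly to the two rush-hour windows quoted above. The helper names `decode_aika` and `interval_minutes` are made up for this illustration and are not part of the dataset or of pandas.

```python
def decode_aika(aika: int) -> tuple[int, int]:
    """Split an HHMM integer such as 615 into (hour, minute)."""
    return divmod(aika, 100)


def interval_minutes(aika: int) -> int:
    """Length of the counting interval that starts at `aika`.

    Assumption: 15-minute bins during 06:00-09:00 and 15:00-18:00,
    hourly bins at all other times.
    """
    hour, _ = decode_aika(aika)
    return 15 if 6 <= hour < 9 or 15 <= hour < 18 else 60


for aika in (0, 615, 900, 1745, 1800):
    hour, minute = decode_aika(aika)
    print(f"{aika:>4} -> {hour:02d}:{minute:02d}, {interval_minutes(aika)} min bin")
```

Because the times are start times, the row at 845 covers 08:45-09:00, while the row at 900 already covers a full hour.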


In [2]:
data = pd.read_csv('data/hki_liikennemaarat.csv', encoding='latin-1',delimiter=';')
data.head()


Out[2]:
piste nimi x_gk25 y_gk25 suunta aika vuosi ha pa ka ra la mp rv autot
0 A01 LAUTTASAAREN SILTA 25494426 6672169 1.0 0 2011 76 5 1 0 5 0 0 87
1 A01 LAUTTASAAREN SILTA 25494426 6672169 1.0 100 2011 65 5 1 0 4 0 0 75
2 A01 LAUTTASAAREN SILTA 25494426 6672169 1.0 200 2011 61 4 1 0 4 0 0 70
3 A01 LAUTTASAAREN SILTA 25494426 6672169 1.0 300 2011 52 4 1 0 3 0 0 60
4 A01 LAUTTASAAREN SILTA 25494426 6672169 1.0 400 2011 31 2 0 0 2 0 0 35

The vehicle count columns are: ha = passenger cars, pa = vans, ka = trucks, ra = articulated trucks, la = buses, mp = motorcycles, rv = trams. The autot column holds the total.

Keep only the rows for the Lauttasaaren silta measurement point, and drop the tram column (rv) together with the point ID, name and coordinate columns.


In [3]:
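# Keep only the rows measured at Lauttasaaren silta.
laru = data[data.nimi == 'LAUTTASAAREN SILTA']
# Keep the columns needed below; .copy() avoids a SettingWithCopyWarning when columns are added later.
laru = laru.loc[:, ['suunta','aika','vuosi','autot','ha','pa','ka','ra','la','mp']].copy()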

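Since the counts are recorded at uneven intervals (15 minutes during rush hours, hourly otherwise), each row is first assigned to the hour in which its interval starts, so that the counts can later be summed per hour.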


In [4]:
# aika is an HHMM integer (e.g. 615 = 06:15); keep only the hour part, as a float.
laru['tunti'] = np.floor(laru['aika'] / 100)
laru


Out[4]:
suunta aika vuosi autot ha pa ka ra la mp tunti
0 1.0 0 2011 87 76 5 1 0 5 0 0.0
1 1.0 100 2011 75 65 5 1 0 4 0 1.0
2 1.0 200 2011 70 61 4 1 0 4 0 2.0
3 1.0 300 2011 60 52 4 1 0 3 0 3.0
4 1.0 400 2011 35 31 2 0 0 2 0 4.0
5 1.0 500 2011 87 76 5 1 0 5 0 5.0
6 1.0 600 2011 54 42 4 1 0 7 0 6.0
7 1.0 615 2011 60 45 4 3 0 8 0 6.0
8 1.0 630 2011 77 60 6 2 0 9 1 6.0
9 1.0 645 2011 96 77 6 3 0 10 1 6.0
10 1.0 700 2011 61 40 12 0 0 9 0 7.0
11 1.0 715 2011 81 55 14 4 0 8 0 7.0
12 1.0 730 2011 148 84 46 6 0 12 6 7.0
13 1.0 745 2011 115 90 12 0 0 13 4 7.0
14 1.0 800 2011 132 105 15 0 0 12 0 8.0
15 1.0 815 2011 78 66 0 3 0 9 0 8.0
16 1.0 830 2011 124 97 6 12 0 9 0 8.0
17 1.0 845 2011 150 113 22 0 0 15 3 8.0
18 1.0 900 2011 422 326 47 12 0 37 6 9.0
19 1.0 1000 2011 389 260 93 24 0 12 6 10.0
20 1.0 1100 2011 399 316 58 13 0 12 15 11.0
21 1.0 1200 2011 469 353 69 25 1 21 0 12.0
22 1.0 1300 2011 495 388 64 20 0 23 4 13.0
23 1.0 1400 2011 497 385 71 12 0 29 9 14.0
24 1.0 1500 2011 115 96 10 3 0 6 2 15.0
25 1.0 1515 2011 134 109 11 3 0 11 5 15.0
26 1.0 1530 2011 144 120 14 0 0 10 2 15.0
27 1.0 1545 2011 126 105 13 2 0 6 2 15.0
28 1.0 1600 2011 145 119 15 1 0 10 12 16.0
29 1.0 1615 2011 147 130 9 2 0 6 4 16.0
... ... ... ... ... ... ... ... ... ... ... ...
23994 2.0 730 2016 77 56 10 3 1 7 3 7.0
23995 2.0 745 2016 101 75 18 1 0 7 0 7.0
23996 2.0 800 2016 111 82 14 2 1 12 2 8.0
23997 2.0 815 2016 117 83 22 4 0 8 1 8.0
23998 2.0 830 2016 108 85 11 3 0 9 1 8.0
23999 2.0 845 2016 102 77 15 3 1 6 2 8.0
24000 2.0 900 2016 420 318 56 15 1 30 6 9.0
24001 2.0 1000 2016 392 291 63 13 0 25 5 10.0
24002 2.0 1100 2016 517 386 79 24 1 27 5 11.0
24003 2.0 1200 2016 482 377 64 15 0 26 3 12.0
24004 2.0 1300 2016 451 352 61 15 0 23 9 13.0
24005 2.0 1400 2016 501 404 60 11 0 26 12 14.0
24006 2.0 1500 2016 168 133 19 7 0 9 2 15.0
24007 2.0 1515 2016 127 103 19 0 0 5 3 15.0
24008 2.0 1530 2016 149 124 15 2 0 8 4 15.0
24009 2.0 1545 2016 160 133 15 4 0 8 2 15.0
24010 2.0 1600 2016 151 125 16 2 0 8 3 16.0
24011 2.0 1615 2016 157 139 9 2 0 7 3 16.0
24012 2.0 1630 2016 176 159 7 0 0 10 6 16.0
24013 2.0 1645 2016 180 156 16 0 0 8 1 16.0
24014 2.0 1700 2016 149 127 12 2 0 8 7 17.0
24015 2.0 1715 2016 176 163 4 0 0 9 5 17.0
24016 2.0 1730 2016 181 161 9 0 0 11 5 17.0
24017 2.0 1745 2016 152 133 13 0 0 6 2 17.0
24018 2.0 1800 2016 485 420 34 1 0 30 20 18.0
24019 2.0 1900 2016 507 444 30 6 1 26 5 19.0
24020 2.0 2000 2016 425 368 29 6 0 22 0 20.0
24021 2.0 2100 2016 319 277 21 4 0 17 0 21.0
24022 2.0 2200 2016 168 145 12 2 0 9 0 22.0
24023 2.0 2300 2016 110 95 8 1 0 6 0 23.0

504 rows × 11 columns

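Let's define light traffic as ha + pa + mp (passenger cars, vans, motorcycles) and heavy traffic as ka + ra + la (trucks, articulated trucks and buses). Buses (la) are also kept as a separate column, so they are counted both in heavy traffic and on their own.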


In [5]:
laru['light traffic'] = laru['ha'] + laru['pa'] + laru['mp']
laru['heavy traffic'] = laru['ka'] + laru['ra'] + laru['la']
laru['buses'] = laru['la']

In [6]:
laru.head()


Out[6]:
suunta aika vuosi autot ha pa ka ra la mp tunti light traffic heavy traffic buses
0 1.0 0 2011 87 76 5 1 0 5 0 0.0 81 6 5
1 1.0 100 2011 75 65 5 1 0 4 0 1.0 70 5 4
2 1.0 200 2011 70 61 4 1 0 4 0 2.0 65 5 4
3 1.0 300 2011 60 52 4 1 0 3 0 3.0 56 4 3
4 1.0 400 2011 35 31 2 0 0 2 0 4.0 33 2 2
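As a quick sanity check of the derived columns, the sketch below compares light plus heavy traffic against the reported total `autot`. The five rows are typed in by hand from the output above. Whether `autot` excludes trams in general is an assumption; in these rows the tram count is zero, so the question does not arise.

```python
import pandas as pd

# First five rows of the table above, typed in by hand for illustration.
sample = pd.DataFrame({
    "autot": [87, 75, 70, 60, 35],
    "ha": [76, 65, 61, 52, 31],
    "pa": [5, 5, 4, 4, 2],
    "ka": [1, 1, 1, 1, 0],
    "ra": [0, 0, 0, 0, 0],
    "la": [5, 4, 4, 3, 2],
    "mp": [0, 0, 0, 0, 0],
})

light = sample["ha"] + sample["pa"] + sample["mp"]
heavy = sample["ka"] + sample["ra"] + sample["la"]  # buses included, as in In [5]

# Rows where the two groups do not add up to the reported total.
mismatch = sample[light + heavy != sample["autot"]]
print(f"{len(mismatch)} of {len(sample)} rows disagree with 'autot'")
```

Running the same comparison on the full `laru` frame, before the original columns are dropped, would show whether the grouping covers every vehicle class in the total.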

In [7]:
laru = laru.drop(['autot','ha','pa','ka','ra','la','mp'],axis=1)
laru.head()


Out[7]:
suunta aika vuosi tunti light traffic heavy traffic buses
0 1.0 0 2011 0.0 81 6 5
1 1.0 100 2011 1.0 70 5 4
2 1.0 200 2011 2.0 65 5 4
3 1.0 300 2011 3.0 56 4 3
4 1.0 400 2011 4.0 33 2 2

In [8]:
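# Sum the quarter-hour rows into hourly totals per year and direction.
# Note that 'aika' gets summed as well (e.g. 600+615+630+645 = 2490); it is meaningless here and dropped later.
yrl = laru.groupby(['vuosi','suunta','tunti']).sum()
yrl.reset_index(inplace=True)
yrl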


Out[8]:
vuosi suunta tunti aika light traffic heavy traffic buses
0 2011 1.0 0.0 0 81 6 5
1 2011 1.0 1.0 100 70 5 4
2 2011 1.0 2.0 200 65 5 4
3 2011 1.0 3.0 300 56 4 3
4 2011 1.0 4.0 400 33 2 2
5 2011 1.0 5.0 500 81 6 5
6 2011 1.0 6.0 2490 246 43 34
7 2011 1.0 7.0 2890 363 52 42
8 2011 1.0 8.0 3290 427 60 45
9 2011 1.0 9.0 900 379 49 37
10 2011 1.0 10.0 1000 359 36 12
11 2011 1.0 11.0 1100 389 25 12
12 2011 1.0 12.0 1200 422 47 21
13 2011 1.0 13.0 1300 456 43 23
14 2011 1.0 14.0 1400 465 41 29
15 2011 1.0 15.0 6090 489 41 33
16 2011 1.0 16.0 6490 532 36 33
17 2011 1.0 17.0 6890 399 28 26
18 2011 1.0 18.0 1800 369 28 26
19 2011 1.0 19.0 1900 449 33 28
20 2011 1.0 20.0 2000 355 26 21
21 2011 1.0 21.0 2100 286 20 16
22 2011 1.0 22.0 2200 182 13 10
23 2011 1.0 23.0 2300 139 10 8
24 2011 2.0 0.0 0 102 7 6
25 2011 2.0 1.0 100 59 4 3
26 2011 2.0 2.0 200 72 5 4
27 2011 2.0 3.0 300 64 5 4
28 2011 2.0 4.0 400 26 1 1
29 2011 2.0 5.0 500 60 4 3
... ... ... ... ... ... ... ...
258 2016 1.0 18.0 1800 430 26 23
259 2016 1.0 19.0 1900 367 27 23
260 2016 1.0 20.0 2000 309 22 18
261 2016 1.0 21.0 2100 213 15 12
262 2016 1.0 22.0 2200 119 9 7
263 2016 1.0 23.0 2300 88 6 5
264 2016 2.0 0.0 0 46 3 3
265 2016 2.0 1.0 100 35 3 2
266 2016 2.0 2.0 200 20 1 1
267 2016 2.0 3.0 300 12 1 1
268 2016 2.0 4.0 400 21 1 1
269 2016 2.0 5.0 500 35 3 2
270 2016 2.0 6.0 2490 117 47 40
271 2016 2.0 7.0 2890 284 39 27
272 2016 2.0 8.0 3290 395 49 35
273 2016 2.0 9.0 900 380 46 30
274 2016 2.0 10.0 1000 359 38 25
275 2016 2.0 11.0 1100 470 52 27
276 2016 2.0 12.0 1200 444 41 26
277 2016 2.0 13.0 1300 422 38 23
278 2016 2.0 14.0 1400 476 37 26
279 2016 2.0 15.0 6090 572 43 30
280 2016 2.0 16.0 6490 640 37 33
281 2016 2.0 17.0 6890 641 36 34
282 2016 2.0 18.0 1800 474 31 30
283 2016 2.0 19.0 1900 479 33 26
284 2016 2.0 20.0 2000 397 28 22
285 2016 2.0 21.0 2100 298 21 17
286 2016 2.0 22.0 2200 157 11 9
287 2016 2.0 23.0 2300 103 7 6

288 rows × 7 columns
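To see what the aggregation does to a rush-hour block, the following sketch rebuilds the 06:00 hour of 2011, direction 1, from its four quarter-hour rows (values copied by hand from Out[4]). Unlike the cell above, it selects the count columns before summing, so `aika` is not added up; this is a variation for illustration, not the notebook's own code.

```python
import pandas as pd

# The four quarter-hour rows for 06:00-06:45 (direction 1, 2011) from Out[4].
quarter_hours = pd.DataFrame({
    "aika": [600, 615, 630, 645],
    "ha": [42, 45, 60, 77],
    "pa": [4, 4, 6, 6],
    "ka": [1, 3, 2, 3],
    "ra": [0, 0, 0, 0],
    "la": [7, 8, 9, 10],
    "mp": [0, 0, 1, 1],
})
quarter_hours["tunti"] = quarter_hours["aika"] // 100
quarter_hours["light traffic"] = quarter_hours[["ha", "pa", "mp"]].sum(axis=1)
quarter_hours["heavy traffic"] = quarter_hours[["ka", "ra", "la"]].sum(axis=1)
quarter_hours["buses"] = quarter_hours["la"]

# Sum only the count columns, so 'aika' is not added up along the way.
hourly = quarter_hours.groupby("tunti")[
    ["light traffic", "heavy traffic", "buses"]
].sum()
print(hourly)
```

The single resulting row (246 light, 43 heavy, 34 buses) agrees with row 6 of Out[8].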


In [9]:
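# Direction 1 is towards the city centre (Helsinki), direction 2 towards Lauttasaari.
y_2016_tohel = yrl[(yrl.vuosi == 2016) & (yrl.suunta == 1.0)]
y_2016_tolaru = yrl[(yrl.vuosi == 2016) & (yrl.suunta == 2.0)]
# Only the hour and the three traffic columns are needed from here on.
y_2016_tohel = y_2016_tohel.drop(['aika','suunta','vuosi'], axis=1)
y_2016_tolaru = y_2016_tolaru.drop(['aika','suunta','vuosi'], axis=1)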

In [10]:
y_2016_tohel.head()


Out[10]:
tunti light traffic heavy traffic buses
240 0.0 33 2 2
241 1.0 24 1 1
242 2.0 11 1 1
243 3.0 22 1 1
244 4.0 23 1 1

In [11]:
y_2016_tolaru.head()


Out[11]:
tunti light traffic heavy traffic buses
264 0.0 46 3 3
265 1.0 35 3 2
266 2.0 20 1 1
267 3.0 12 1 1
268 4.0 21 1 1

In [12]:
# Turn the hour number into a timedelta (vectorised, no per-row lambda needed).
y_2016_tohel['time'] = pd.to_timedelta(y_2016_tohel['tunti'], unit='h')
y_2016_tolaru['time'] = pd.to_timedelta(y_2016_tolaru['tunti'], unit='h')
y_2016_tolaru = y_2016_tolaru.set_index('time')
y_2016_tohel = y_2016_tohel.set_index('time')
y_2016_tohel


Out[12]:
tunti light traffic heavy traffic buses
time
00:00:00 0.0 33 2 2
01:00:00 1.0 24 1 1
02:00:00 2.0 11 1 1
03:00:00 3.0 22 1 1
04:00:00 4.0 23 1 1
05:00:00 5.0 64 5 4
06:00:00 6.0 179 35 27
07:00:00 7.0 515 57 40
08:00:00 8.0 824 56 37
09:00:00 9.0 593 51 28
10:00:00 10.0 431 44 27
11:00:00 11.0 458 38 25
12:00:00 12.0 457 38 23
13:00:00 13.0 432 33 21
14:00:00 14.0 489 31 24
15:00:00 15.0 578 45 35
16:00:00 16.0 578 36 30
17:00:00 17.0 493 32 29
18:00:00 18.0 430 26 23
19:00:00 19.0 367 27 23
20:00:00 20.0 309 22 18
21:00:00 21.0 213 15 12
22:00:00 22.0 119 9 7
23:00:00 23.0 88 6 5

In [13]:
y_2016_tohel = y_2016_tohel.drop(['tunti'], axis = 1)
y_2016_tolaru = y_2016_tolaru.drop(['tunti'], axis = 1)
y_2016_tolaru


Out[13]:
light traffic heavy traffic buses
time
00:00:00 46 3 3
01:00:00 35 3 2
02:00:00 20 1 1
03:00:00 12 1 1
04:00:00 21 1 1
05:00:00 35 3 2
06:00:00 117 47 40
07:00:00 284 39 27
08:00:00 395 49 35
09:00:00 380 46 30
10:00:00 359 38 25
11:00:00 470 52 27
12:00:00 444 41 26
13:00:00 422 38 23
14:00:00 476 37 26
15:00:00 572 43 30
16:00:00 640 37 33
17:00:00 641 36 34
18:00:00 474 31 30
19:00:00 479 33 26
20:00:00 397 28 22
21:00:00 298 21 17
22:00:00 157 11 9
23:00:00 103 7 6

In [14]:
xinterval = pd.date_range('1/1/2011', periods=24, freq='H').time

In [15]:
plt.figure(figsize=(23,13))
plt.xticks(xinterval,rotation=45)
plt.grid(True)
plt.title('Traffic towards Lauttasaari. Representative sample from 2016, 1 hour sample interval')
plt.plot(xinterval,y_2016_tolaru['light traffic']);
plt.plot(xinterval,y_2016_tolaru['heavy traffic']);
plt.plot(xinterval,y_2016_tolaru['buses']);
plt.legend(['light traffic','heavy traffic','buses']);
plt.xlabel('Time');
plt.ylabel('Vehicles / hour');



In [16]:
plt.figure(figsize=(23,13))
plt.xticks(xinterval,rotation=45)
plt.grid(True)
plt.title('Traffic towards Helsinki. Representative sample from 2016, 1 hour sample interval')
plt.plot(xinterval,y_2016_tohel['light traffic']);
plt.plot(xinterval,y_2016_tohel['heavy traffic']);
plt.plot(xinterval,y_2016_tohel['buses']);
plt.legend(['light traffic','heavy traffic','buses']);
plt.xlabel('Time');
plt.ylabel('Vehicles / hour');



In [17]:
y_2016_tohel.to_csv('data/dist_larubridge_tohel_16.csv')
y_2016_tolaru.to_csv('data/dist_larubridge_tolaru_16.csv')
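
When these files are read back later, the `time` index arrives as plain strings rather than timedeltas. The sketch below shows one way to restore it. It uses a three-row stand-in for `y_2016_tolaru` and an in-memory buffer instead of the real files, so the frame contents and the `buffer` variable are illustrative assumptions.

```python
import io

import pandas as pd

# Toy stand-in for y_2016_tolaru (first three rows of Out[13]).
frame = pd.DataFrame(
    {"light traffic": [46, 35, 20], "heavy traffic": [3, 3, 1], "buses": [3, 2, 1]},
    index=pd.to_timedelta([0, 1, 2], unit="h").rename("time"),
)

buffer = io.StringIO()  # in-memory file, so nothing is written to data/
frame.to_csv(buffer)
buffer.seek(0)

restored = pd.read_csv(buffer, index_col="time")
# read_csv yields plain strings for the index; convert them back explicitly.
restored.index = pd.to_timedelta(restored.index)
print(restored)
```

With the real files, the `buffer` would be replaced by the path passed to `to_csv` above.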