The raw data contains the following fields per station per reading:
The following variables will be derived from the raw data.
In [1]:
%matplotlib inline
import logging
import itertools
import json
import os
import pickle
import folium
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from datetime import datetime
from os import listdir
from os.path import isfile, join
from IPython.display import Image
from datetime import date
from src.data.parse_dataset import parse_dir, parse_json_files, get_file_list
from src.data.string_format import format_name, to_short_name
from src.data.visualization import lon_min_longitude, lon_min_latitude, lon_max_longitude, lon_max_latitude, lon_center_latitude, lon_center_longitude, create_london_map
logger = logging.getLogger()
logger.setLevel(logging.INFO)
In [2]:
def parse_cycles(json_obj):
    """Parses TfL's BikePoint JSON response"""
    return [parse_station(element) for element in json_obj]

def parse_station(element):
    """Parses a JSON bicycle station object into a dictionary"""
    obj = {
        'Id': element['id'],
        'Name': element['commonName'],
        'Latitude': element['lat'],
        'Longitude': element['lon'],
        'PlaceType': element['placeType'],
    }
    for p in element['additionalProperties']:
        obj[p['key']] = p['value']
        if 'Timestamp' not in obj:
            obj['Timestamp'] = p['modified']
        elif obj['Timestamp'] != p['modified']:
            raise ValueError('The properties\' timestamps for station %s do not match: %s != %s' % (
                obj['Id'], obj['Timestamp'], p['modified']))
    return obj
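For reference, this is roughly the shape of the input and output of parse_station. The station values below are made up for illustration; only the keys are taken from the parsing code above.
# hypothetical BikePoint element, with only the keys used by parse_station
element = {
    'id': 'BikePoints_1',
    'commonName': 'Example Street, Clerkenwell',
    'lat': 51.5291,
    'lon': -0.1099,
    'placeType': 'BikePoint',
    'additionalProperties': [
        {'key': 'NbBikes', 'value': '12', 'modified': '2016-05-16T07:00:00.000'},
        {'key': 'NbEmptyDocks', 'value': '6', 'modified': '2016-05-16T07:00:00.000'},
    ],
}
parse_station(element)
# {'Id': 'BikePoints_1', 'Name': 'Example Street, Clerkenwell', 'Latitude': 51.5291,
#  'Longitude': -0.1099, 'PlaceType': 'BikePoint', 'NbBikes': '12', 'NbEmptyDocks': '6',
#  'Timestamp': '2016-05-16T07:00:00.000'}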
In [3]:
def bike_file_date_fn(file_name):
    """Gets the file's date from its name"""
    return datetime.strptime(os.path.basename(file_name), 'BIKE-%Y-%m-%d:%H:%M:%S.json')

def create_between_dates_filter(file_date_fn, date_start, date_end):
    def filter_fn(file_name):
        file_date = file_date_fn(file_name)
        return date_start <= file_date <= date_end
    return filter_fn
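As a quick sanity check, here is a sketch of how these helpers behave; the file name below is made up but follows the BIKE-%Y-%m-%d:%H:%M:%S.json pattern expected above.
# hypothetical file name following the expected naming convention
example_file = 'BIKE-2016-05-16:07:00:00.json'
bike_file_date_fn(example_file)
# datetime.datetime(2016, 5, 16, 7, 0)

morning_filter = create_between_dates_filter(bike_file_date_fn,
                                             datetime(2016, 5, 16, 0, 0, 0),
                                             datetime(2016, 5, 16, 12, 0, 0))
morning_filter(example_file)
# True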
In [4]:
filter_fn = create_between_dates_filter(bike_file_date_fn,
datetime(2016, 5, 16, 7, 0, 0),
datetime(2016, 5, 16, 23, 59, 59))
records = parse_dir('/home/jfconavarrete/Documents/Work/Dissertation/spts-uoe/data/raw/cycles',
parse_cycles, sort_fn=bike_file_date_fn, filter_fn=filter_fn)
# records is a list of lists of dicts
df = pd.DataFrame(list(itertools.chain.from_iterable(records)))
In [5]:
df.head()
Out[5]:
In [6]:
df[df['Id'] == 'BikePoints_1'].head()
Out[6]:
Due to memory constraints, we'll parse the data in chunks. In each chunk we'll remove the redundant candidate keys and drop any duplicate rows (see the small illustration after the chunker helper below).
In [7]:
def chunker(seq, size):
    """Yields successive chunks of `seq` with at most `size` elements each"""
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))
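A small illustration of how chunker batches a list (the values are arbitrary):
list(chunker(['a', 'b', 'c', 'd', 'e'], 2))
# [['a', 'b'], ['c', 'd'], ['e']]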
In [8]:
def split_data(parsed_data):
    """Splits the parsed records into a readings dataframe and a stations dataframe"""
    master_df = pd.DataFrame(list(itertools.chain.from_iterable(parsed_data)))
    readings_df = pd.DataFrame(master_df, columns=['Id', 'Timestamp', 'NbBikes', 'NbDocks', 'NbEmptyDocks'])
    stations_df = pd.DataFrame(master_df, columns=['Id', 'Name', 'TerminalName', 'PlaceType', 'Latitude',
                                                   'Longitude', 'Installed', 'Temporary', 'Locked',
                                                   'RemovalDate', 'InstallDate'])
    return (readings_df, stations_df)
In [ ]:
# get the files to parse
five_weekdays_filter = create_between_dates_filter(bike_file_date_fn,
datetime(2016, 6, 19, 0, 0, 0),
datetime(2016, 6, 27, 23, 59, 59))
files = get_file_list('data/raw/cycles', filter_fn=None, sort_fn=bike_file_date_fn)
# process the files in chunks
files_batches = chunker(files, 500)
In [ ]:
# start with an empty dataset
readings_dataset = pd.DataFrame()
stations_dataset = pd.DataFrame()
# append each chunk to the datasets while removing duplicates
for batch in files_batches:
    parsed_data = parse_json_files(batch, parse_cycles)
    # split the data into stations data and readings data
    readings_df, stations_df = split_data(parsed_data)
    # append the chunk to the datasets
    readings_dataset = pd.concat([readings_dataset, readings_df])
    stations_dataset = pd.concat([stations_dataset, stations_df])
    # remove duplicated rows
    readings_dataset.drop_duplicates(inplace=True)
    stations_dataset.drop_duplicates(inplace=True)
In [ ]:
# put the parsed data in pickle files
pickle.dump(readings_dataset, open("data/parsed/readings_dataset_raw.p", "wb"))
pickle.dump(stations_dataset, open("data/parsed/stations_dataset_raw.p", "wb"))
In [9]:
stations_dataset = pickle.load(open('data/parsed/stations_dataset_raw.p', 'rb'))
readings_dataset = pickle.load(open('data/parsed/readings_dataset_raw.p', 'rb'))
In [10]:
# convert columns to their appropriate datatypes
stations_dataset['InstallDate'] = pd.to_numeric(stations_dataset['InstallDate'], errors='raise')
stations_dataset['RemovalDate'] = pd.to_numeric(stations_dataset['RemovalDate'], errors='raise')
stations_dataset['Installed'].replace({'true': True, 'false': False}, inplace=True)
stations_dataset['Temporary'].replace({'true': True, 'false': False}, inplace=True)
stations_dataset['Locked'].replace({'true': True, 'false': False}, inplace=True)
readings_dataset['NbBikes'] = readings_dataset['NbBikes'].astype('uint16')
readings_dataset['NbDocks'] = readings_dataset['NbDocks'].astype('uint16')
readings_dataset['NbEmptyDocks'] = readings_dataset['NbEmptyDocks'].astype('uint16')
In [11]:
# format station name
stations_dataset['Name'] = stations_dataset['Name'].apply(format_name)
In [12]:
# convert string timestamp to datetime
stations_dataset['InstallDate'] = pd.to_datetime(stations_dataset['InstallDate'], unit='ms', errors='raise')
stations_dataset['RemovalDate'] = pd.to_datetime(stations_dataset['RemovalDate'], unit='ms', errors='raise')
readings_dataset['Timestamp'] = pd.to_datetime(readings_dataset['Timestamp'], format='%Y-%m-%dT%H:%M:%S.%f', errors='raise').dt.tz_localize('UTC')
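As a quick illustration of the readings conversion above (the timestamp value is made up but follows the raw format):
pd.to_datetime('2016-05-16T07:03:25.000', format='%Y-%m-%dT%H:%M:%S.%f').tz_localize('UTC')
# Timestamp('2016-05-16 07:03:25+0000', tz='UTC')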
In [13]:
# sort the datasets
stations_dataset.sort_values(by=['Id'], ascending=True, inplace=True)
readings_dataset.sort_values(by=['Timestamp'], ascending=True, inplace=True)
In [14]:
# derive a short station name and the number of unusable (out of service) docks
stations_dataset['ShortName'] = stations_dataset['Name'].apply(to_short_name)
# e.g. a station with 20 docks, 12 bikes and 6 empty docks has 2 unusable docks
readings_dataset['NbUnusableDocks'] = readings_dataset['NbDocks'] - (readings_dataset['NbBikes'] + readings_dataset['NbEmptyDocks'])
Station priorities were downloaded from https://www.whatdotheyknow.com/request/tfl_boris_bike_statistics?unfold=1
In [15]:
stations_priorities = pd.read_csv('data/raw/priorities/station_priorities.csv', encoding='latin-1')
stations_priorities['Site'] = stations_priorities['Site'].apply(format_name)
In [16]:
stations_dataset = pd.merge(stations_dataset, stations_priorities, how='left', left_on='ShortName', right_on='Site')
stations_dataset['Priority'].replace({'One': '1', 'Two': '2', 'Long Term Suspended': np.NaN, 'Long term suspension': np.NaN}, inplace=True)
stations_dataset.drop(['Site'], axis=1, inplace=True)
stations_dataset.drop(['Borough'], axis=1, inplace=True)
In [17]:
stations_dataset
Out[17]:
In [18]:
stations_dataset.shape
Out[18]:
In [19]:
stations_dataset.info(memory_usage='deep')
In [20]:
stations_dataset.head()
Out[20]:
In [21]:
stations_dataset.describe()
Out[21]:
In [22]:
stations_dataset.apply(lambda x:x.nunique())
Out[22]:
In [23]:
stations_dataset.isnull().sum()
Out[23]:
In [24]:
def find_duplicate_ids(df):
    """Finds Ids that have more than one value in any of the given columns"""
    df = df.drop_duplicates()
    value_counts_grouped_by_id = df.groupby('Id').count()
    is_duplicate_id = value_counts_grouped_by_id.applymap(lambda x: x > 1).any(axis=1)
    duplicate_ids = value_counts_grouped_by_id[is_duplicate_id].index.values
    return df[df['Id'].isin(duplicate_ids)]

duplicate_ids = find_duplicate_ids(stations_dataset)
duplicate_ids
Out[24]:
Given that these records share the same location and Id but differ in Name or TerminalName, we'll assume the station was renamed and remove the earlier entries.
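A quick way to confirm which columns actually disagree (a sketch, reusing the find_duplicate_ids helper defined above):
# count the distinct values per column for each duplicated Id;
# columns with a count greater than 1 are the ones whose values differ across the duplicates
find_duplicate_ids(stations_dataset).groupby('Id').nunique()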
In [25]:
# remove the one not in Merchant Street
stations_dataset.drop(417, inplace=True)
# remove the one with the shortest name
stations_dataset.drop(726, inplace=True)
# remove the one that is not in King's Cross (as the station name implies it should be)
stations_dataset.drop(745, inplace=True)
# remove the remaining duplicated entries
stations_dataset.drop([747, 743, 151, 754, 765, 768], inplace=True)
In [26]:
# make sure there are no repeated ids
assert len(find_duplicate_ids(stations_dataset)) == 0
Let's have a closer look at the station locations. All of them should be in Greater London.
In [27]:
def find_locations_outside_box(locations, min_longitude, min_latitude, max_longitude, max_latitude):
    """Returns the locations whose coordinates fall outside the given bounding box"""
    latitude_check = ~((locations['Latitude'] >= min_latitude) & (locations['Latitude'] <= max_latitude))
    longitude_check = ~((locations['Longitude'] >= min_longitude) & (locations['Longitude'] <= max_longitude))
    return locations[latitude_check | longitude_check]

outlier_locations_df = find_locations_outside_box(stations_dataset, lon_min_longitude, lon_min_latitude,
                                                  lon_max_longitude, lon_max_latitude)
outlier_locations_df
Out[27]:
This station looks like a test station, so we'll remove it.
In [28]:
outlier_locations_idx = outlier_locations_df.index.values
stations_dataset.drop(outlier_locations_idx, inplace=True)
In [29]:
# make sure there are no stations outside London
assert len(find_locations_outside_box(stations_dataset, lon_min_longitude, lon_min_latitude,
lon_max_longitude, lon_max_latitude)) == 0
Next, we'll investigate the stations that share a latitude or longitude value with another station.
In [30]:
# find stations with duplicate longitude
id_counts_groupedby_longitude = stations_dataset.groupby('Longitude')['Id'].count()
nonunique_longitudes = id_counts_groupedby_longitude[id_counts_groupedby_longitude != 1].index.values
nonunique_longitude_stations = stations_dataset[stations_dataset['Longitude'].isin(nonunique_longitudes)].sort_values(by=['Longitude'])
id_counts_groupedby_latitude = stations_dataset.groupby('Latitude')['Id'].count()
nonunique_latitudes = id_counts_groupedby_latitude[id_counts_groupedby_latitude != 1].index.values
nonunique_latitudes_stations = stations_dataset[stations_dataset['Latitude'].isin(nonunique_latitudes)].sort_values(by=['Latitude'])
nonunique_coordinates_stations = pd.concat([nonunique_longitude_stations, nonunique_latitudes_stations])
nonunique_coordinates_stations
Out[30]:
In [31]:
def draw_stations_map(stations_df):
    """Draws the given stations as markers on a map of London"""
    stations_map = create_london_map()
    for index, station in stations_df.iterrows():
        folium.Marker([station['Latitude'], station['Longitude']], popup=station['Name']).add_to(stations_map)
    return stations_map
In [32]:
draw_stations_map(nonunique_coordinates_stations)
Out[32]:
We can observe that the stations are indeed different and that sharing the same longitude is just a coincidence.
Let's plot all the stations on a map to see how they are distributed.
In [33]:
london_longitude = -0.127722
london_latitude = 51.507981
# plot only the first MAX_RECORDS stations
MAX_RECORDS = 100
stations_map = create_london_map()
for index, station in stations_dataset[0:MAX_RECORDS].iterrows():
    folium.Marker([station['Latitude'], station['Longitude']], popup=station['Name']).add_to(stations_map)
stations_map
#folium.Map.save(stations_map, 'reports/maps/stations_map.html')
Out[33]:
In [34]:
readings_dataset.shape
Out[34]:
In [35]:
readings_dataset.info(memory_usage='deep')
In [36]:
readings_dataset.head()
Out[36]:
In [37]:
readings_dataset.describe()
Out[37]:
In [38]:
readings_dataset.apply(lambda x:x.nunique())
Out[38]:
In [39]:
readings_dataset.isnull().sum()
Out[39]:
In [40]:
timestamps = readings_dataset['Timestamp']
ax = timestamps.groupby([timestamps.dt.year, timestamps.dt.month, timestamps.dt.day]).count().plot(kind="bar")
ax.set_xlabel('Date')
ax.set_title('Readings per Day')
Out[40]:
In [41]:
# restrict the readings to the 15 May - 27 June 2016 window
start_date = date(2016, 5, 15)
end_date = date(2016, 6, 27)
days = set(pd.date_range(start=start_date, end=end_date, closed='left'))
readings_dataset = readings_dataset[(timestamps > start_date) & (timestamps < end_date)]
In [42]:
# get a subview of the readings dataset
id_timestamp_view = readings_dataset.loc[:,['Id','Timestamp']]
# remove the time component of the timestamp
id_timestamp_view['Timestamp'] = id_timestamp_view['Timestamp'].apply(lambda x: x.replace(hour=0, minute=0, second=0, microsecond=0))
# compute the days of readings per stations
days_readings = id_timestamp_view.groupby('Id').aggregate(lambda x: set(x))
days_readings['MissingDays'] = days_readings['Timestamp'].apply(lambda x: list(days - x))
days_readings['MissingDaysCount'] = days_readings['MissingDays'].apply(lambda x: len(x))
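The missing-days computation above boils down to a set difference per station; a tiny illustration with made-up dates:
all_days = {pd.Timestamp('2016-06-19'), pd.Timestamp('2016-06-20'), pd.Timestamp('2016-06-21')}
observed_days = {pd.Timestamp('2016-06-19'), pd.Timestamp('2016-06-21')}
sorted(all_days - observed_days)
# [Timestamp('2016-06-20 00:00:00')]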
In [43]:
pickle.dump(days_readings.query('MissingDaysCount > 0'), open("data/parsed/missing_days.p", "wb"))
In [44]:
def expand_datetime(df, datetime_col):
    """Adds a Weekday column derived from the given datetime column"""
    df['Weekday'] = df[datetime_col].apply(lambda x: x.weekday())
    return df
In [45]:
# get the stations with missing readings only
missing_days_readings = days_readings[days_readings['MissingDaysCount'] != 0]
missing_days_readings = missing_days_readings['MissingDays'].apply(lambda x: pd.Series(x)).unstack().dropna()
missing_days_readings.index = missing_days_readings.index.droplevel()
# sort and format in their own DF
missing_days_readings = pd.DataFrame(missing_days_readings, columns=['MissingDay'], index=None).reset_index().sort_values(by=['Id', 'MissingDay'])
# expand the missing day date
expand_datetime(missing_days_readings, 'MissingDay')
Out[45]:
In [46]:
missing_days_readings
Out[46]:
In [47]:
missing_days_readings['Id'].nunique()
Out[47]:
In [48]:
# plot the missing readings days
days = missing_days_readings['MissingDay']
missing_days_counts = days.groupby([days.dt.year, days.dt.month, days.dt.day]).count()
ax = missing_days_counts.plot(kind="bar")
ax.set_xlabel('Date')
ax.set_ylabel('Number of Stations')
Out[48]:
Stations with no readings on at least one day
In [49]:
missing_days_readings_stations = stations_dataset[stations_dataset['Id'].isin(missing_days_readings['Id'].unique())]
draw_stations_map(missing_days_readings_stations)
Out[49]:
Stations with no readings on at least one weekend day
In [50]:
weekend_readings = missing_days_readings[missing_days_readings['Weekday'] > 4]
missing_dayreadings_stn = stations_dataset[stations_dataset['Id'].isin(weekend_readings['Id'].unique())]
draw_stations_map(missing_dayreadings_stn)
Out[50]:
Stations with no readings on at least one weekday
In [51]:
weekday_readings = missing_days_readings[missing_days_readings['Weekday'] < 5]
missing_dayreadings_stn = stations_dataset[stations_dataset['Id'].isin(weekday_readings['Id'].unique())]
draw_stations_map(missing_dayreadings_stn)
Out[51]:
Observations:
In [59]:
# station Ids that appear in the readings but were removed from the stations dataset
stations_to_remove = set(readings_dataset.Id) - set(stations_dataset.Id)
In [60]:
# drop the readings that belong to the removed stations
readings_dataset = readings_dataset[~readings_dataset.Id.isin(stations_to_remove)]
In [62]:
readings_dataset.reset_index(inplace=True, drop=True)
In [63]:
readings_dataset.head()
Out[63]:
In [65]:
readings_dataset.describe()
Out[65]:
In [66]:
readings_dataset.info(memory_usage='deep')
In [67]:
pickle.dump(readings_dataset, open("data/parsed/readings_dataset_utc.p", "wb"))
In [68]:
stations_dataset.reset_index(inplace=True, drop=True)
In [69]:
stations_dataset.head()
Out[69]:
In [70]:
stations_dataset.describe()
Out[70]:
In [71]:
stations_dataset.info(memory_usage='deep')
In [72]:
pickle.dump(stations_dataset, open("data/parsed/stations_dataset_final.p", "wb"))