In this assignment, your challenge is to do some basic analysis for Airbnb. Two data files, bookings.csv and listings.csv, are provided in hw/data/. The objective is to practice data munging and to begin our exploration of regression.
In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
bookings=pd.read_csv('../data/bookings.csv')
In [3]:
bookings.info()
bookings.head(5)
Out[3]:
In [4]:
listings=pd.read_csv('../data/listings.csv')
In [5]:
listings.info()
listings.head(5)
Out[5]:
In [6]:
listings.price.mean()
Out[6]:
In [7]:
listings.mean(axis=0, numeric_only=True)
Out[7]:
In [8]:
listings.median(axis=0, numeric_only=True)
Out[8]:
In [9]:
listings.std(axis=0, numeric_only=True)
Out[9]:
In [10]:
listings.groupby('prop_type')[['price', 'person_capacity', 'picture_count', 'description_length', 'tenure_months']].mean()
Out[10]:
In [11]:
listings.groupby(['neighborhood', 'prop_type'])[['price', 'person_capacity', 'picture_count', 'description_length', 'tenure_months']].mean()
Out[11]:
In [12]:
bookings.booking_date = pd.to_datetime(bookings.booking_date)
print(dir(bookings.booking_date[0]))
In [13]:
bookings.booking_date.value_counts().sort_index().plot()
Out[13]:
In [14]:
bookings.info()
bookings.booking_date.head(5)
Out[14]:
In [15]:
listMerge = listings.merge(bookings, on='prop_id')
listMerge.groupby(['neighborhood','booking_date'])['prop_id'].agg(['count']).unstack(0).plot()
Out[15]:
In [17]:
listings.columns
Out[17]:
In [18]:
bookings.head(5)
Out[18]:
In [19]:
book_by_prop=bookings.groupby('prop_id')[['prop_id']].count()
book_by_prop.head()
Out[19]:
In [20]:
book_by_prop.rename(columns={'prop_id':'number_of_bookings'}, inplace=True)
In [21]:
book_by_prop.reset_index(inplace=True)
In [22]:
book_by_prop.info()
book_by_prop.head(10)
Out[22]:
In [23]:
listings=listings.merge(book_by_prop, on='prop_id', how='left')
In [24]:
listings.fillna(0.0, inplace=True)
In [25]:
listings.head(10)
Out[25]:
In [26]:
listings.info()
In [27]:
listings['booking_rate']=listings.number_of_bookings/listings.tenure_months
In [28]:
listings=listings[listings.tenure_months>10]
prop_type and neighborhood are categorical variables. Use get_dummies() (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.reshape.get_dummies.html) to transform these columns of categorical data into many columns of boolean values (after applying the function correctly there should be one column for every prop_type category and one column for every neighborhood category). A sketch of joining the dummy columns back onto listings follows the two cells below.
In [29]:
pd.get_dummies(listings.prop_type)
Out[29]:
In [30]:
pd.get_dummies(listings.neighborhood)
Out[30]:
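The dummy frames above are not yet attached to listings. A minimal sketch of one way to do that with pd.concat; the names prop_dummies, hood_dummies, and listings_encoded are illustrative, not part of the assignment:
# One-hot encode both categorical columns and concatenate the result onto listings.
prop_dummies = pd.get_dummies(listings.prop_type, prefix='prop_type')
hood_dummies = pd.get_dummies(listings.neighborhood, prefix='neighborhood')
listings_encoded = pd.concat([listings, prop_dummies, hood_dummies], axis=1)
listings_encoded.head()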
The target (y) is booking_rate; the regressors (X) are everything else except prop_id, booking_rate, prop_type, neighborhood, and number_of_bookings (a sketch follows the two links below).
http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.train_test_split.html
http://pandas.pydata.org/pandas-docs/stable/basics.html#dropping-labels-from-an-axis
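Putting the two references together, a minimal sketch, assuming the listings_encoded frame from the get_dummies sketch above and the modern sklearn.model_selection location of train_test_split; the random_state value is an arbitrary choice for reproducibility:
from sklearn.model_selection import train_test_split

# Drop the target plus the excluded columns; every remaining column becomes a regressor.
excluded = ['prop_id', 'booking_rate', 'prop_type', 'neighborhood', 'number_of_bookings']
X = listings_encoded.drop(columns=excluded)
y = listings_encoded.booking_rate

# Hold out a third of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)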
In [31]:
listings.booking_rate.hist()
Out[31]:
In [33]:
np.log(listings.booking_rate)
Out[33]:
In [34]:
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in current scikit-learn
feature_cols = ['price', 'tenure_months', 'person_capacity', 'description_length', 'picture_count']
a, b = listings[feature_cols], listings.booking_rate
a_train, a_test, b_train, b_test = train_test_split(a, b, test_size=0.33)
In [35]:
listings.info()
a_train
Out[35]:
In [36]:
b_train.values.reshape(-1, 1)
Out[36]:
In [37]:
b_train.shape
Out[37]:
In [38]:
#need to include price, person capacity, picture count, description length, and tenure months
In [39]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
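make_pipeline and PolynomialFeatures are imported but never used below; a minimal sketch of how they could be chained on the same features (degree 2 is an arbitrary illustrative choice):
# Expand the features with degree-2 polynomial terms, then fit an ordinary linear model on them.
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(a_train, b_train)
poly_model.score(a_test, b_test)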
In [40]:
#Linear Regression
clf = LinearRegression()
clf.fit(a_train, b_train)
Out[40]:
In [41]:
b_pred = clf.predict(a_test)
In [42]:
a_test
print(b_pred[0], a_test.iloc[0])
In [43]:
# Let's compute sum of Errors between Actual and Predicted
# Again, more on this next week - I just want to show how these tools work together
sum_sq_model = np.sum((b_test - b_pred) ** 2)
sum_sq_model
Out[43]:
In [44]:
# Compare with the base naive model where we say predicted value is just the mean value
sum_sq_naive = np.sum((b_test - b.mean()) ** 2)
sum_sq_naive
Out[44]:
In [45]:
clf.score(a_test,b_test)
Out[45]:
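The score above is R². It ties back to the two sums of squares computed earlier; a quick check, noting that sum_sq_naive used b.mean() over all of listings rather than the test-set mean, so this only approximates clf.score:
# R^2 is 1 minus the ratio of the model's squared error to the mean-only baseline's squared error.
1 - sum_sq_model / sum_sq_naive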