1. Compute the average temperature by season ('season_desc'). (The temperatures are numbers between 0 and 1, but don't worry about that. Let's say that's the Shellman temperature scale.)



In [1]:

    
import pandas as pd
import numpy as np
from pandas import Series, DataFrame



In [2]:

    
weather = pd.read_table('daily_weather.tsv')



In [3]:

    
weather.groupby('season_desc').agg({'temp': np.mean})









    Out[3]:

                     temp
    season_desc
    Fall         0.711445
    Spring       0.321700
    Summer       0.554557
    Winter       0.419368
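
As a minimal sketch of the same grouping step, the snippet below runs on a small made-up frame. Only the column names `season_desc` and `temp` come from the notebook; the rows are invented for illustration and say nothing about the real data.

```python
import pandas as pd

# Made-up rows; only the column names mirror the notebook.
toy = pd.DataFrame({
    "season_desc": ["Fall", "Fall", "Winter", "Winter"],
    "temp": [0.70, 0.72, 0.40, 0.44],
})

# Selecting the column first and calling .mean() returns a Series
# with one average per season label.
season_means = toy.groupby("season_desc")["temp"].mean()
print(season_means)
```

Selecting the column before aggregating returns a Series rather than a one-column DataFrame, which can be handier for sorting or plotting.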



In [4]:

    
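# Relabel every season in one pass; a dict mapping does not re-match its own output, so no trailing underscores are needed.
fix = weather.replace({'season_desc': {'Fall': 'Summer', 'Summer': 'Spring', 'Winter': 'Fall', 'Spring': 'Winter'}})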
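
When several labels swap places, renaming them one after another can re-match a value that was already renamed. A sketch of one way around that, on a toy column, is to look each original value up once in a mapping. The relabeling below follows the cell above; whether it is the right correction for the real data set is an assumption.

```python
import pandas as pd

# Relabeling taken from the cell above; its correctness for the real
# data is an assumption.
relabel = {"Fall": "Summer", "Summer": "Spring", "Winter": "Fall", "Spring": "Winter"}

# Made-up column with one row per season label.
toy = pd.DataFrame({"season_desc": ["Fall", "Summer", "Winter", "Spring"]})

# Series.map looks up each original value exactly once, so a row that
# becomes "Summer" is not renamed again to "Spring".
toy["season_fixed"] = toy["season_desc"].map(relabel)
print(toy)
```

Writing the result to a new column keeps the original labels available for comparison.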



In [6]:

    
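# Note: this still groups the original `weather`, not the relabeled `fix`, so the result matches Out[3].
weather.groupby('season_desc').agg({'temp': np.mean})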









    Out[6]:

                     temp
    season_desc
    Fall         0.711445
    Spring       0.321700
    Summer       0.554557
    Winter       0.419368

2. Several of the columns represent dates or datetimes, but out of the box pd.read_table does not parse them as such. This makes it hard to (for example) compute the number of rentals by month. Fix the dates and compute the number of rentals by month.



In [9]:

    
weather['months'] = pd.DatetimeIndex(weather.date).month



In [10]:

    
weather.groupby('months').agg({'total_riders': np.sum})









    Out[10]:

            total_riders
    months
    1              96744
    2             103137
    3             164875
    4             174224
    5             195865
    6             202830
    7             203607
    8             214503
    9             218573
    10            198841
    11            152664
    12            123713
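
An alternative to converting the column after loading is to parse it while reading. The sketch below uses a made-up in-memory TSV in place of daily_weather.tsv; only the column names `date` and `total_riders` come from the notebook, and the date format of the real file is an assumption.

```python
import io
import pandas as pd

# Made-up TSV text standing in for daily_weather.tsv.
tsv = io.StringIO(
    "date\ttotal_riders\n"
    "2012-01-01\t100\n"
    "2012-01-15\t150\n"
    "2012-02-01\t200\n"
)

# parse_dates converts the named column to datetime64 while reading.
toy = pd.read_csv(tsv, sep="\t", parse_dates=["date"])

# The .dt accessor exposes date parts such as the month number,
# which can be used directly as the grouping key.
riders_by_month = toy.groupby(toy["date"].dt.month)["total_riders"].sum()
print(riders_by_month)
```

Grouping on `toy["date"].dt.month` avoids adding a helper column, though a stored `months` column as in the cell above is convenient when the key is reused.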

weather[['total_riders', 'temp']].corr()

3. Investigate how the number of rentals varies with temperature. Is this trend constant across seasons? Across months?



In [11]:

    
weather[['total_riders', 'temp', 'months']].groupby('months').corr()









    Out[11]:

                               temp  total_riders
    months
    1       temp           1.000000      0.689495
            total_riders   0.689495      1.000000
    2       temp           1.000000      0.716206
            total_riders   0.716206      1.000000
    3       temp           1.000000      0.735575
            total_riders   0.735575      1.000000
    4       temp           1.000000      0.533387
            total_riders   0.533387      1.000000
    5       temp           1.000000      0.065599
            total_riders   0.065599      1.000000
    6       temp           1.000000     -0.330884
            total_riders  -0.330884      1.000000
    7       temp           1.000000     -0.184704
            total_riders  -0.184704      1.000000
    8       temp           1.000000      0.288264
            total_riders   0.288264      1.000000
    9       temp           1.000000     -0.418753
            total_riders  -0.418753      1.000000
    10      temp           1.000000      0.466666
            total_riders   0.466666      1.000000
    11      temp           1.000000      0.511232
            total_riders   0.511232      1.000000
    12      temp           1.000000      0.690062
            total_riders   0.690062      1.000000
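
The full correlation matrix per month repeats each value twice and always has 1.0 on the diagonal. A more compact sketch, on made-up numbers, returns one temperature/ridership correlation per month. The column names follow the notebook; the rows are invented so that the expected signs are obvious.

```python
import pandas as pd

# Made-up data: ridership rises with temp in month 1 and falls in month 2.
toy = pd.DataFrame({
    "months": [1, 1, 1, 2, 2, 2],
    "temp": [0.1, 0.2, 0.3, 0.1, 0.2, 0.3],
    "total_riders": [10, 20, 30, 30, 20, 10],
})

# For each month, compute the Pearson correlation between the two columns.
corr_by_month = (
    toy.groupby("months")[["temp", "total_riders"]]
    .apply(lambda g: g["temp"].corr(g["total_riders"]))
)
print(corr_by_month)
```

The result is a Series indexed by month, which is easier to scan or plot than the stacked matrices.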

weather[['total_riders', 'temp', 'season_desc']].groupby('season_desc').corr()



In [12]:

    
weather[['no_casual_riders', 'no_reg_riders', 'temp']].corr()









    Out[12]:

                      no_casual_riders  no_reg_riders      temp
    no_casual_riders          1.000000       0.274984  0.542253
    no_reg_riders             0.274984       1.000000  0.607425
    temp                      0.542253       0.607425  1.000000

4. There are various types of users in the usage data sets. What sorts of things can you say about how they use the bikes differently?



In [13]:

    
weather[['no_casual_riders', 'no_reg_riders']].corr()









    Out[13]:

                      no_casual_riders  no_reg_riders
    no_casual_riders          1.000000       0.274984
    no_reg_riders             0.274984       1.000000



In [16]:

    
weather[['is_holiday', 'total_riders']].sum()









    Out[16]:





is_holiday           11
total_riders    2049576
dtype: int64



In [15]:

    
weather[['is_holiday', 'total_riders']].corr()









    Out[15]:

                  is_holiday  total_riders
    is_holiday      1.000000     -0.118134
    total_riders   -0.118134      1.000000
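
Correlations alone say little about how the two user types differ. One possible next step, sketched here on made-up numbers, is to compare average casual and registered ridership on holidays versus other days. The column names come from the notebook; the values are invented and imply nothing about the real data.

```python
import pandas as pd

# Made-up rows: two ordinary days and two holidays.
toy = pd.DataFrame({
    "is_holiday": [0, 0, 1, 1],
    "no_casual_riders": [100, 200, 400, 600],
    "no_reg_riders": [1000, 1200, 500, 700],
})

# Average riders of each type, split by holiday flag.
means = toy.groupby("is_holiday")[["no_casual_riders", "no_reg_riders"]].mean()

# Share of the average ridership that comes from casual users.
means["casual_share"] = means["no_casual_riders"] / (
    means["no_casual_riders"] + means["no_reg_riders"]
)
print(means)
```

The same pattern works with any other grouping key in the data, such as `season_desc` or `months`.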



