Assignment 3

Using the data in the 2013_NYC_CD_MedianIncome_Recycle.xlsx file, create a predictor using the weights from the fitted model. This time, read the weights from the model's built-in attributes rather than hard-coding them into your algorithm.



In [1]:

    
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import statsmodels.formula.api as smf



In [3]:

    
df = pd.read_excel("/home/sean/git/algorithms/class4/homework/data/2013_NYC_CD_MedianIncome_Recycle.xlsx")



In [4]:

    
df.head()









    Out[4]:






  
    
      
                                       CD_Name  MdHHIncE  RecycleRate
0  Battery Park City, Greenwich Village & Soho    119596     0.286771
1  Battery Park City, Greenwich Village & Soho    119596     0.264074
2                  Chinatown & Lower East Side     40919     0.156485
3  Chelsea, Clinton & Midtown Business Distric     92583     0.235125
4  Chelsea, Clinton & Midtown Business Distric     92583     0.246725



In [46]:

    
lm = smf.ols(formula="RecycleRate~MdHHIncE",data=df).fit()



In [47]:

    
lm.params









    Out[47]:





Intercept    0.074804
MdHHIncE     0.000002
dtype: float64



In [65]:

    
intercept, slope = lm.params
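
As a sketch of the same idea, the weights can also be looked up by label rather than by position, and the fitted results object can produce predictions directly through its `predict` method. The data below is synthetic and exactly linear (it is not the NYC spreadsheet), and the `demo_*` names are made up for illustration only.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic, exactly linear stand-in data (NOT the NYC spreadsheet).
demo = pd.DataFrame({"MdHHIncE": [40000.0, 60000.0, 90000.0, 120000.0]})
demo["RecycleRate"] = 0.1 + 2e-6 * demo["MdHHIncE"]

demo_lm = smf.ols(formula="RecycleRate ~ MdHHIncE", data=demo).fit()

# Label-based lookup does not depend on the order of the entries in params.
demo_intercept = demo_lm.params["Intercept"]
demo_slope = demo_lm.params["MdHHIncE"]

# predict() applies the fitted formula to new rows that use the same column name.
new_rows = pd.DataFrame({"MdHHIncE": [50000.0]})
demo_pred = demo_lm.predict(new_rows).iloc[0]

print(demo_intercept, demo_slope, demo_pred)
```

Because the synthetic data lies exactly on a line, the recovered weights should match the 0.1 and 2e-6 used to generate it, up to floating-point error.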



In [57]:

    
slope









    Out[57]:





1.8696126676606164e-06



In [58]:

    
intercept









    Out[58]:





0.074804136152441802



In [67]:

    
df.plot(kind='scatter', x='MdHHIncE', y='RecycleRate')
plt.plot(df["MdHHIncE"],slope*df["MdHHIncE"]+intercept,"-", c='red')









    Out[67]:





[<matplotlib.lines.Line2D at 0x7faf9250d128>]



In [85]:

    
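def income_to_rate(income_str):
    """Return the predicted recycle rate in percent, as a string with 3 significant figures."""
    income = float(income_str)
    # Linear prediction from the fitted weights, converted from a fraction to a percentage
    rate_pct = (slope * income + intercept) * 100
    return '%s' % float('%.3g' % rate_pct)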



In [86]:

    
income_to_rate(50000)









    Out[86]:





'16.8'



In [90]:

    
income=input('Enter median neighborhood income: $')
print('Predicted recycle rate for this neighborhood is {}%'.format(income_to_rate(income)))









    



Enter median neighborhood income: $1000000
Predicted recycle rate for this neighborhood is 194.0%
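
A straight line has no upper bound, so an income far from the incomes in the data yields a rate above 100%. One possible guard, shown here only as a sketch, is to clamp the prediction to the valid range for a rate. The coefficients are copied from the values printed earlier in this notebook, and `clamped_rate` is a hypothetical helper, not part of the assignment.

```python
# Coefficients copied from the outputs printed earlier in this notebook.
INTERCEPT = 0.074804136152441802
SLOPE = 1.8696126676606164e-06


def clamped_rate(income):
    """Linear prediction limited to the valid range for a rate, 0.0 to 1.0."""
    raw = SLOPE * float(income) + INTERCEPT
    return min(max(raw, 0.0), 1.0)


print(round(clamped_rate(50000), 4))   # 0.1683
print(clamped_rate(1000000))           # 1.0
```

Clamping only hides the symptom: a prediction this far outside the fitted data is an extrapolation and should be treated with caution either way.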


