Using the data in the 2013_NYC_CD_MedianIncome_Recycle.xlsx file, build a predictor from the weights of a fitted model. This time, read the weights from the model's built-in attributes rather than hard-coding them into your algorithm.



In [1]:

    
import pandas as pd
import statsmodels.formula.api as smf



In [2]:

    
df = pd.read_excel("2013_NYC_CD_MedianIncome_Recycle.xlsx")



In [3]:

    
df.head()









    Out[3]:






  
    
      
   CD_Name                                      MdHHIncE  RecycleRate
0  Battery Park City, Greenwich Village & Soho    119596     0.286771
1  Battery Park City, Greenwich Village & Soho    119596     0.264074
2  Chinatown & Lower East Side                     40919     0.156485
3  Chelsea, Clinton & Midtown Business Distric     92583     0.235125
4  Chelsea, Clinton & Midtown Business Distric     92583     0.246725



In [8]:

    
# rename the columns to names that are easier to use in a model formula
df.columns = ['Neighborhood', 'Median_Income', 'Recycle_Rate']



In [9]:

    
df.head()









    Out[9]:






  
    
      
   Neighborhood                                 Median_Income  Recycle_Rate
0  Battery Park City, Greenwich Village & Soho         119596      0.286771
1  Battery Park City, Greenwich Village & Soho         119596      0.264074
2  Chinatown & Lower East Side                          40919      0.156485
3  Chelsea, Clinton & Midtown Business Distric          92583      0.235125
4  Chelsea, Clinton & Midtown Business Distric          92583      0.246725



In [10]:

    
# fit an ordinary least squares model: Recycle_Rate as a linear function of Median_Income
lm = smf.ols(formula="Recycle_Rate ~ Median_Income", data=df).fit()



In [12]:

    
lm.params  # get the fitted parameters (intercept and slope) from the model









    Out[12]:





Intercept        0.074804
Median_Income    0.000002
dtype: float64
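To make the two fitted parameters less opaque, here is a minimal sketch of the closed-form least-squares formulas for a single predictor. It is not how statsmodels computes the fit internally, and it uses only the five rows shown by `df.head()`, so its numbers will differ from the full-data values above. The helper name `fit_line` is hypothetical.

```python
# Illustrative data: only the five rows shown by df.head(), not the full file.
incomes = [119596, 119596, 40919, 92583, 92583]
rates = [0.286771, 0.264074, 0.156485, 0.235125, 0.246725]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


sample_intercept, sample_slope = fit_line(incomes, rates)
# Scientific notation keeps the digits of a very small slope visible.
print(f"intercept={sample_intercept:.6f}, slope={sample_slope:.3e}")
```

The slope is small because income is measured in dollars while the recycle rate is a fraction, which is why the default display above rounds it to `0.000002`.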



In [11]:

    
intercept, slope = lm.params  # unpack in index order: Intercept first, then Median_Income



In [13]:

    
def pre_recycle(median_income):
    """Predict the recycle rate from a median income using the fitted weights."""
    recycle_rate = intercept + slope * median_income
    return recycle_rate



In [17]:

    
pre_recycle(92000)









    Out[17]:





0.24680850157721865
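As a cross-check on a hand-written predictor, the fitted results object can also make the prediction itself. The sketch below assumes a model fitted with the same formula, but on only the five rows shown by `df.head()`, so its output will not match the value above. It looks up the coefficients by label in `params` and compares the manual formula with the `predict` method. The names `sample`, `sample_lm`, `manual`, and `predicted` are illustrative.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: only the five rows shown by df.head(), not the full file.
sample = pd.DataFrame({
    "Median_Income": [119596, 119596, 40919, 92583, 92583],
    "Recycle_Rate": [0.286771, 0.264074, 0.156485, 0.235125, 0.246725],
})
sample_lm = smf.ols(formula="Recycle_Rate ~ Median_Income", data=sample).fit()

# Look the coefficients up by label rather than relying on their order.
b0 = sample_lm.params["Intercept"]
b1 = sample_lm.params["Median_Income"]
manual = b0 + b1 * 92000

# predict() applies the same fitted formula to a DataFrame of new inputs.
new_data = pd.DataFrame({"Median_Income": [92000]})
predicted = sample_lm.predict(new_data).iloc[0]

print(manual, predicted)
```

Label-based lookup keeps working if the formula later gains more terms, whereas tuple unpacking of `params` depends on the number and order of coefficients.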


