Assignment 3

Using the data in the 2013_NYC_CD_MedianIncome_Recycle.xlsx file, create a predictor that uses the weights from the fitted model. This time, read the weights from the model's built-in attributes rather than hard-coding them into your algorithm.



In [7]:

    
# xlrd lets pandas read Excel files (newer pandas versions use openpyxl for .xlsx)
!pip3 install xlrd
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
%matplotlib inline









    



Collecting xlrd
  Using cached xlrd-1.0.0-py3-none-any.whl
Installing collected packages: xlrd
Successfully installed xlrd-1.0.0



In [10]:

    
df = pd.read_excel("data/2013_NYC_CD_MedianIncome_Recycle.xlsx")



In [11]:

    
df.head()









    Out[11]:






  
    
      
   CD_Name                                      MdHHIncE  RecycleRate
0  Battery Park City, Greenwich Village & Soho    119596     0.286771
1  Battery Park City, Greenwich Village & Soho    119596     0.264074
2  Chinatown & Lower East Side                     40919     0.156485
3  Chelsea, Clinton & Midtown Business Distric     92583     0.235125
4  Chelsea, Clinton & Midtown Business Distric     92583     0.246725



In [20]:

    
lm = smf.ols(formula="RecycleRate~MdHHIncE",data=df).fit()



In [21]:

    
lm.params









    Out[21]:





Intercept    0.074804
MdHHIncE     0.000002
dtype: float64



In [22]:

    
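# Look the weights up by name so the order of lm.params does not matter
intercept, slope = lm.params["Intercept"], lm.params["MdHHIncE"]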



In [27]:

    
# Scatter the observations, then draw the fitted line on the same axes
ax = df.plot(kind='scatter',x='MdHHIncE',y='RecycleRate',color='gray',alpha=0.8,linewidth=0)
ax.plot(df["MdHHIncE"],slope*df["MdHHIncE"]+intercept,"-",color="red",alpha=0.5)









    Out[27]:





[<matplotlib.lines.Line2D at 0x10b883828>]



In [28]:

    
print("The model is: Recycle rate =", slope,"* medianincome +",intercept)



The model is: Recycle rate = 1.86961266766e-06 * medianincome + 0.0748041361524
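
As a quick sanity check of the printed equation, the arithmetic can be worked by hand for one district. This is only a sketch: the coefficients are copied from the output above purely for illustration, and the names `printed_slope`, `printed_intercept`, `predicted` and `residuals` are hypothetical, not part of the notebook. The predictor itself should still read its weights from `lm.params`.

```python
# Coefficients copied from the printed equation above (illustration only;
# the real predictor should take them from lm.params).
printed_slope = 1.86961266766e-06
printed_intercept = 0.0748041361524

# Median household income of the first district shown by df.head().
income = 119596
predicted = printed_slope * income + printed_intercept

# Observed recycle rates for the two preview rows with that income.
observed = [0.286771, 0.264074]
residuals = [rate - predicted for rate in observed]

print("predicted:", round(predicted, 4))
print("residuals:", [round(r, 4) for r in residuals])
```

Both preview rows for this district sit slightly below the fitted line, which is what a negative residual means here.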



In [31]:

    
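def get_rrate(income):
    """Predict the recycle rate for a median household income using the fitted weights."""
    # slope and intercept come from lm.params, not hard-coded values
    recycle_rate = income * slope + intercept
    return recycle_rate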
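
One way to check that a predictor really uses the model's own weights is to compare it with the fitted model's `predict` method. The sketch below is an assumption-laden illustration, not the assignment's required solution: it fits on only the five preview rows shown by `df.head()` (so its coefficients differ from the full-dataset values), and `sample`, `sample_lm`, `make_predictor` and `predict_rrate` are hypothetical names introduced here.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumption: only the five preview rows from df.head() are used, so the
# fitted coefficients will not match the full-dataset values in the notebook.
sample = pd.DataFrame({
    "MdHHIncE": [119596, 119596, 40919, 92583, 92583],
    "RecycleRate": [0.286771, 0.264074, 0.156485, 0.235125, 0.246725],
})
sample_lm = smf.ols(formula="RecycleRate ~ MdHHIncE", data=sample).fit()


def make_predictor(model, feature="MdHHIncE"):
    """Return a function that predicts from the model's own params (hypothetical helper)."""
    def predict(income):
        # Weights are looked up on the fitted model each time, never hard-coded.
        return model.params["Intercept"] + model.params[feature] * income
    return predict


predict_rrate = make_predictor(sample_lm)

# Compare the hand-rolled predictor with statsmodels' built-in predict().
manual = predict_rrate(50000)
builtin = sample_lm.predict(pd.DataFrame({"MdHHIncE": [50000]})).iloc[0]
print(manual, builtin)
```

Because the helper reads `model.params` at call time, refitting on different data changes the predictions without any edits to the function.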


