Using the data from the 2013_NYC_CD_MedianIncome_Recycle.xlsx file, create a predictor using the weights from the model. This time, use the built in attributes in your model rather than hard-coding them into your algorithm



In [1]:

    
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import statsmodels.formula.api as smf

df = pd.read_excel('2013_NYC_CD_MedianIncome_Recycle.xlsx')



In [2]:

    
df.head()









    Out[2]:






  
    
      
      CD_Name
      MdHHIncE
      RecycleRate
    
  
  
    
      0
      Battery Park City, Greenwich Village & Soho
      119596
      0.286771
    
    
      1
      Battery Park City, Greenwich Village & Soho
      119596
      0.264074
    
    
      2
      Chinatown & Lower East Side
      40919
      0.156485
    
    
      3
      Chelsea, Clinton & Midtown Business Distric
      92583
      0.235125
    
    
      4
      Chelsea, Clinton & Midtown Business Distric
      92583
      0.246725



In [3]:

    
df.plot(kind='scatter',x='MdHHIncE',y='RecycleRate')









    Out[3]:





<matplotlib.axes._subplots.AxesSubplot at 0x9633ac8>



In [4]:

    
df.columns = ['location', 'median_income', 'recycle_rate']



In [5]:

    
df.corr()['median_income']['recycle_rate']









    Out[5]:





0.8847831827851157



In [6]:

    
lm = smf.ols(formula="recycle_rate~median_income",data=df).fit()
intercept, slope = lm.params
lm.params









    Out[6]:





Intercept        0.074804
median_income    0.000002
dtype: float64



In [7]:

    
df.plot(kind="scatter",x="median_income",y="recycle_rate")
plt.plot(df["median_income"],slope*df["median_income"]+intercept,"-",color="red") 
plt.xlabel('Median Income')
plt.ylabel('Recycle Rate')









    Out[7]:





<matplotlib.text.Text at 0x989da20>



In [8]:

    
def predicting_recylerate(income):
    return intercept + float(income) * slope



In [9]:

    
x = input('What is the median income of your location? ')

print('Expected recycling rate : ' + str(round(predicting_recylerate(x), 2)))









    



What is the median income of your location? 40919
Expected recycling rate : 0.15

	CD_Name	MdHHIncE	RecycleRate
0	Battery Park City, Greenwich Village & Soho	119596	0.286771
1	Battery Park City, Greenwich Village & Soho	119596	0.264074
2	Chinatown & Lower East Side	40919	0.156485
3	Chelsea, Clinton & Midtown Business Distric	92583	0.235125
4	Chelsea, Clinton & Midtown Business Distric	92583	0.246725