Assignment 3

Using the data in the 2013_NYC_CD_MedianIncome_Recycle.xlsx file, create a predictor that uses the weights from the fitted model. This time, read the weights from the model's built-in attributes rather than hard-coding them into your algorithm.


In [1]:
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt # package for doing plotting (necessary for adding the line)
import statsmodels.formula.api as smf # package we'll be using for linear regression

In [3]:
df = pd.read_excel("data/2013_NYC_CD_MedianIncome_Recycle.xlsx")
df.head()


Out[3]:
                                       CD_Name  MdHHIncE  RecycleRate
0  Battery Park City, Greenwich Village & Soho    119596     0.286771
1  Battery Park City, Greenwich Village & Soho    119596     0.264074
2                  Chinatown & Lower East Side     40919     0.156485
3  Chelsea, Clinton & Midtown Business Distric     92583     0.235125
4  Chelsea, Clinton & Midtown Business Distric     92583     0.246725

In [4]:
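def weight_predict(your_income):
    """Predict the recycle rate for a median household income from the model's weights."""
    # Fit RecycleRate as a linear function of MdHHIncE, reusing the df loaded above
    lm = smf.ols(formula="RecycleRate~MdHHIncE", data=df).fit()
    # lm.params holds the fitted weights: the MdHHIncE slope and the Intercept
    recycle_rate = your_income * lm.params.MdHHIncE + lm.params.Intercept
    return recycle_rate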
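
As a sanity check on the weights-based formula, the sketch below compares it with the fitted model's own `predict` method. It is self-contained, so it assumes a stand-in DataFrame (`sample`, a hypothetical name) built from only the five rows shown in Out[3]; its weights will therefore differ from those fitted on the full spreadsheet.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Stand-in data: only the five rows visible in Out[3], not the full file.
sample = pd.DataFrame({
    "MdHHIncE": [119596, 119596, 40919, 92583, 92583],
    "RecycleRate": [0.286771, 0.264074, 0.156485, 0.235125, 0.246725],
})
lm = smf.ols(formula="RecycleRate~MdHHIncE", data=sample).fit()

income = 92583

# Prediction assembled by hand from the model's weights
manual = income * lm.params.MdHHIncE + lm.params.Intercept

# Prediction from the model's predict method, which expects a DataFrame
# with the same column name used in the formula
built_in = lm.predict(pd.DataFrame({"MdHHIncE": [income]})).iloc[0]

print(manual, built_in)
```

The two numbers should agree up to floating-point rounding, which confirms that `lm.params` carries everything the predictor needs.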

In [5]:
weight_predict(119596)


Out[5]:
0.29840233275398098
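
To see where the two weights come from, here is a minimal sketch that computes a slope and intercept with the closed-form least-squares formulas in plain Python, then applies them the same way `weight_predict` does. This is a swapped-in illustration, not the statsmodels implementation; `fit_line` and `predict` are hypothetical helper names, and the data are only the five rows shown in Out[3], so the result will not match Out[5].

```python
# Five rows copied from Out[3]; the full spreadsheet has more rows,
# so these weights differ from the ones fitted on the whole file.
incomes = [119596, 119596, 40919, 92583, 92583]
rates = [0.286771, 0.264074, 0.156485, 0.235125, 0.246725]


def fit_line(x, y):
    """Closed-form simple least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(incomes, rates)


def predict(income):
    """Apply the weights exactly as weight_predict does."""
    return income * slope + intercept


print(predict(119596))
```

The slope plays the role of `lm.params.MdHHIncE` and the intercept that of `lm.params.Intercept`.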
