Use the data from heights_weights_genders.csv to build a simple predictor that takes a person's height and estimates their weight, using a model fitted on all the data regardless of gender. To do this, fit the model once, read its parameters (lm.params), and use those in your function (i.e. don't fit a new model on every call).
In [1]:
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt # package for doing plotting (necessary for adding the line)
import statsmodels.formula.api as smf # package we'll be using for linear regression
In [3]:
df = pd.read_csv("heights_weights_genders.csv")
In [4]:
df.plot(kind="scatter",x="Height",y="Weight")
Out[4]:
In [5]:
lm = smf.ols(formula="Weight~Height",data=df).fit() #notice the formula regresses Y on X (Y~X)
In [6]:
lm.params
Out[6]:
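For intuition about what lm.params holds: with a single predictor, ordinary least squares has a closed-form intercept and slope. The sketch below computes them in plain Python. The helper name fit_line and the toy numbers are assumptions made for illustration; they are not part of the notebook and do not come from the CSV.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up heights and weights, for illustration only (not from the CSV)
toy_heights = [60.0, 65.0, 70.0, 75.0]
toy_weights = [110.0, 140.0, 170.0, 200.0]
toy_intercept, toy_slope = fit_line(toy_heights, toy_weights)
```

On the real data, these two numbers should correspond to the Intercept and Height entries of lm.params, up to floating-point rounding.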
In [7]:
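# look the parameters up by label rather than relying on their order in the Series
intercept, slope = lm.params["Intercept"], lm.params["Height"]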
In [8]:
df.plot(kind="scatter",x="Height",y="Weight")
# x is df["Height"]; y is slope*df["Height"] + intercept, i.e. y = mx + b
plt.plot(df["Height"],slope*df["Height"]+intercept,"-",color="orange") # draw the best-fit line from the fitted parameters
Out[8]:
In [9]:
def get_weight(height):
    """Predict weight from height using the already-fitted slope and intercept."""
    return slope*height+intercept
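get_weight reads slope and intercept from the notebook's global state. One alternative, sketched here under assumptions, is to bind the parameters once inside a closure so the predictor carries its own copy. The name make_weight_predictor and the placeholder parameter values are hypothetical; in the notebook you would pass the values read from lm.params instead.

```python
def make_weight_predictor(intercept, slope):
    """Build a predictor that reuses fixed parameters instead of refitting."""
    def predict(height):
        # Same formula as the best-fit line: y = m*x + b
        return slope * height + intercept
    return predict

# Placeholder parameters, for illustration only (not the fitted values)
predict_weight = make_weight_predictor(intercept=-250.0, slope=6.0)
example_weight = predict_weight(70.0)
```

The closure keeps working even if slope or intercept is later reassigned elsewhere in the notebook.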
In [17]:
get_weight(72) # Height in this dataset appears to be in inches, so pass 72 for a 6 ft person
Out[17]: