In [2]:
df1 = pd.read_csv("../nyc.csv")
df1.tail()
Out[2]:
Case
Restaurant
Price
Food
Decor
Service
East
163
164
Baci
31
17
15
16
0
164
165
Puccini
26
20
16
17
0
165
166
Bella Luna
31
18
16
17
0
166
167
M�tisse
38
22
17
21
0
167
168
Gennaro
34
24
10
16
0
In [3]:
model1 = sm.OLS.from_formula("Price ~ Food + Decor + Service + C(East)", data=df1)
result1 = model1.fit()
print(result1.summary())
OLS Regression Results
==============================================================================
Dep. Variable: Price R-squared: 0.628
Model: OLS Adj. R-squared: 0.619
Method: Least Squares F-statistic: 68.76
Date: Mon, 13 Jun 2016 Prob (F-statistic): 5.35e-34
Time: 04:18:56 Log-Likelihood: -529.36
No. Observations: 168 AIC: 1069.
Df Residuals: 163 BIC: 1084.
Df Model: 4
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept -24.0238 4.708 -5.102 0.000 -33.321 -14.727
C(East)[T.1] 2.0681 0.947 2.184 0.030 0.199 3.938
Food 1.5381 0.369 4.169 0.000 0.810 2.267
Decor 1.9101 0.217 8.802 0.000 1.482 2.339
Service -0.0027 0.396 -0.007 0.995 -0.785 0.780
==============================================================================
Omnibus: 5.180 Durbin-Watson: 1.760
Prob(Omnibus): 0.075 Jarque-Bera (JB): 5.039
Skew: 0.304 Prob(JB): 0.0805
Kurtosis: 3.591 Cond. No. 357.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In [4]:
sm.stats.anova_lm(result1)
Out[4]:
df
sum_sq
mean_sq
F
PR(>F)
C(East)
1.0
502.313658
502.313658
15.257019
1.371938e-04
Food
1.0
5248.583781
5248.583781
159.417813
6.347591e-26
Decor
1.0
3304.097144
3304.097144
100.356965
1.045660e-18
Service
1.0
0.001560
0.001560
0.000047
9.945162e-01
Residual
163.0
5366.521715
32.923446
NaN
NaN
In [6]:
df20 = pd.read_csv("../cars04.csv")
df2 = df20[["EngineSize","Cylinders","Horsepower","HighwayMPG","Weight","WheelBase","Hybrid","SuggestedRetailPrice"]]
df2.head()
Out[6]:
EngineSize
Cylinders
Horsepower
HighwayMPG
Weight
WheelBase
Hybrid
SuggestedRetailPrice
0
1.6
4
103
34
2370
98
0
11690
1
1.6
4
103
34
2348
98
0
12585
2
2.2
4
140
37
2617
104
0
14610
3
2.2
4
140
37
2676
104
0
14810
4
2.2
4
140
37
2617
104
0
16385
In [ ]:
model2 = sm.OLS.from_formula("SuggestedRetailPrice ~ EngineSize + ")
Content source: rrbb014/fastcampus_dss
Similar notebooks: