Multiple Regression Analysis (重回帰分析)

Author: Yoshimasa Ogawa
LastModified: 2015-12-23

This notebook works through Chapter 2, "Multiple Regression Analysis," of 圓川隆夫『多変量のデータ解析』(Takao Enkawa, *Multivariate Data Analysis*, Asakura Shoten, 1988) in Python.


In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Load the data
data = pd.read_csv('data/tab26.csv')
data.head()


Out[2]:
y x1 x2 x3 x4
0 2.9 2 0 0 1
1 10.5 8 1 0 3
2 7.5 6 1 1 2
3 6.5 4 1 0 1
4 6.0 3 1 1 1

In [3]:
# Set the explanatory variables
X = data[['x1', 'x2', 'x3', 'x4']]
X = sm.add_constant(X)
# Set the response variable
Y = data['y']

In [4]:
# Run OLS
model1 = sm.OLS(Y, X)
results1 = model1.fit()
results1.summary()


Out[4]:
OLS Regression Results
Dep. Variable: y R-squared: 0.897
Model: OLS Adj. R-squared: 0.814
Method: Least Squares F-statistic: 10.86
Date: Wed, 23 Dec 2015 Prob (F-statistic): 0.0111
Time: 00:28:14 Log-Likelihood: -9.7935
No. Observations: 10 AIC: 29.59
Df Residuals: 5 BIC: 31.10
Df Model: 4
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
const 2.2947 0.904 2.539 0.052 -0.029 4.618
x1 0.5771 0.395 1.463 0.203 -0.437 1.591
x2 1.4843 0.723 2.053 0.095 -0.374 3.343
x3 -0.8356 0.707 -1.182 0.290 -2.653 0.981
x4 0.5600 1.151 0.487 0.647 -2.398 3.518
Omnibus: 0.141 Durbin-Watson: 2.294
Prob(Omnibus): 0.932 Jarque-Bera (JB): 0.156
Skew: 0.162 Prob(JB): 0.925
Kurtosis: 2.483 Cond. No. 28.8

In [5]:
# Set the explanatory variables
X = data[['x1', 'x2', 'x3']]
X = sm.add_constant(X)
# Run OLS
model2 = sm.OLS(Y, X)
results2 = model2.fit()
results2.summary()


Out[5]:
OLS Regression Results
Dep. Variable: y R-squared: 0.892
Model: OLS Adj. R-squared: 0.838
Method: Least Squares F-statistic: 16.51
Date: Wed, 23 Dec 2015 Prob (F-statistic): 0.00265
Time: 00:28:14 Log-Likelihood: -10.025
No. Observations: 10 AIC: 28.05
Df Residuals: 6 BIC: 29.26
Df Model: 3
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
const 2.5003 0.746 3.350 0.015 0.674 4.327
x1 0.7599 0.112 6.765 0.001 0.485 1.035
x2 1.3900 0.651 2.136 0.077 -0.202 2.982
x3 -0.8633 0.658 -1.311 0.238 -2.474 0.747
Omnibus: 0.107 Durbin-Watson: 2.260
Prob(Omnibus): 0.948 Jarque-Bera (JB): 0.300
Skew: 0.158 Prob(JB): 0.861
Kurtosis: 2.213 Cond. No. 19.5

In [6]:
# Set the explanatory variables
X = data[['x1', 'x2']]
X = sm.add_constant(X)
# Run OLS
model3 = sm.OLS(Y, X)
results3 = model3.fit()
results3.summary()


Out[6]:
OLS Regression Results
Dep. Variable: y R-squared: 0.861
Model: OLS Adj. R-squared: 0.821
Method: Least Squares F-statistic: 21.67
Date: Wed, 23 Dec 2015 Prob (F-statistic): 0.00100
Time: 00:28:14 Log-Likelihood: -11.285
No. Observations: 10 AIC: 28.57
Df Residuals: 7 BIC: 29.48
Df Model: 2
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
const 2.6155 0.778 3.360 0.012 0.775 4.456
x1 0.7369 0.117 6.324 0.000 0.461 1.012
x2 1.0233 0.617 1.658 0.141 -0.436 2.483
Omnibus: 0.240 Durbin-Watson: 2.132
Prob(Omnibus): 0.887 Jarque-Bera (JB): 0.396
Skew: -0.021 Prob(JB): 0.820
Kurtosis: 2.026 Cond. No. 18.0

In [7]:
# Set the explanatory variable
X = data[['x1']]
X = sm.add_constant(X)
# Run OLS
model4 = sm.OLS(Y, X)
results4 = model4.fit()
results4.summary()


Out[7]:
OLS Regression Results
Dep. Variable: y R-squared: 0.806
Model: OLS Adj. R-squared: 0.782
Method: Least Squares F-statistic: 33.31
Date: Wed, 23 Dec 2015 Prob (F-statistic): 0.000419
Time: 00:28:14 Log-Likelihood: -12.942
No. Observations: 10 AIC: 29.88
Df Residuals: 8 BIC: 30.49
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
const 3.3053 0.726 4.551 0.002 1.630 4.980
x1 0.7421 0.129 5.771 0.000 0.446 1.039
Omnibus: 1.684 Durbin-Watson: 2.321
Prob(Omnibus): 0.431 Jarque-Bera (JB): 0.731
Skew: -0.652 Prob(JB): 0.694
Kurtosis: 2.766 Cond. No. 13.5

In [8]:
# Model selection
criteria = pd.DataFrame(index=['results1', 'results2', 'results3', 'results4'])
criteria["AIC"] = [results1.aic, results2.aic, results3.aic, results4.aic]
criteria["BIC"] = [results1.bic, results2.bic, results3.bic, results4.bic]
criteria


Out[8]:
AIC BIC
results1 29.587070 31.099996
results2 28.049772 29.260113
results3 28.570178 29.477933
results4 29.883355 30.488525
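
Per the table, results2 (x1, x2, x3) minimizes both AIC and BIC. The row label of the minimum can be read off directly with `idxmin`; restating the table values as a standalone check:

```python
import pandas as pd

# Criterion values copied from the table above
criteria = pd.DataFrame(
    {'AIC': [29.587070, 28.049772, 28.570178, 29.883355],
     'BIC': [31.099996, 29.260113, 29.477933, 30.488525]},
    index=['results1', 'results2', 'results3', 'results4'])

best = criteria.idxmin()  # index label of the minimum in each column
print(best)
```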