ods graphics on/off
PROC REG <options>;
   <label:> MODEL dependents=<regressors> </options>;
   BY variables;
   FREQ variable;
   ID variables;
   VAR variables;
   WEIGHT variable;
   ADD variables;
   DELETE variables;
   <label:> MTEST <equation,...,equation> </options>;
   OUTPUT <OUT=SAS-data-set> keyword=names <...keyword=names>;
   PAINT <condition | ALLOBS> </options> | <STATUS | UNDO>;
   PLOT <yvariable*xvariable> <=symbol>
      <...yvariable*xvariable> <=symbol> </options>;
   PRINT <options> <ANOVA> <MODELDATA>;
   REFIT;
   RESTRICT equation,...,equation;
   REWEIGHT <condition | ALLOBS> </options> | <STATUS | UNDO>;
   <label:> TEST equation <,...,equation> </option>;
In [1]:
%let path=/folders/myfolders/ECST131;
libname statdata "&path";
Out[1]:
Standard error for correlation:
$$SE_{r} = \sqrt{\frac{1-r^{2}}{n-2}}$$

The NOSIMPLE option tells the procedure that you do not want the default output of means and standard deviations for each of the variables in the VAR and WITH lists. The RANK option orders the correlations from largest to smallest (by their absolute values).
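As a quick numeric sketch (plain Python, not part of the SAS output), the standard error formula above can be applied directly; the values of r and n below are made up for illustration:

```python
import math

def corr_se(r: float, n: int) -> float:
    """Standard error of a sample correlation r from n observations,
    using SE = sqrt((1 - r^2) / (n - 2))."""
    return math.sqrt((1 - r ** 2) / (n - 2))

# Illustrative values: r = 0.8 based on n = 27 observations
se = corr_se(0.8, 27)   # approximately 0.12
```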
Two types of plots are available when you use ODS Graphics with PROC CORR: a panel or matrix of scatter plots (like the one above) or individual scatter plots.
The ONLY option says that you want only separate bivariate plots for each variable pair, rather than the scatter plot matrix that is produced by default. By default, the maximum number of individual plots is set at five.
As with the matrix plot, you can use the NVAR= option to request additional scatter plots.
In [2]:
proc print data=statdata.fitness;
run;
Out[2]:
In [3]:
proc corr data=statdata.fitness rank pearson spearman
plots(only)=scatter(nvar=all ellipse=none);
var RunTime Age Weight Run_Pulse
Rest_Pulse Maximum_Pulse Performance;
with Oxygen_Consumption;
title "Correlations and Scatter Plots with Oxygen_Consumption";
run;
title;
Out[3]:
Specifying the IMAGEMAP=ON option after a slash in the ODS GRAPHICS statement enables the tooltip feature.
In [4]:
ods graphics on / imagemap=on;
proc corr data=statdata.fitness
plots=matrix(nvar=all histogram);
var RunTime Age Weight Run_Pulse
Rest_Pulse Maximum_Pulse Performance;
id name;
title "Correlation Matrix and Scatter Plot Matrix of Fitness Predictors";
run;
title;
Out[4]:
In [5]:
ods graphics on / imagemap=off;
proc corr data=statdata.fitness
plots=(matrix scatter);
var RunTime Age Weight Run_Pulse
Rest_Pulse Maximum_Pulse Performance;
id name;
title "Correlation Matrix and Scatter Plot Matrix of Fitness Predictors";
run;
title;
Out[5]:
In [6]:
ods graphics on;
title "Computing Pearson Correlation Coefficients";
proc corr data=exercise nosimple rank
/*plots = matrix(nvar=all);*/
plots(only)=scatter (ellipse = confidence);
/*plots(only) = scatter(ellipse = none);*/
var Rest_Pulse Max_Pulse Run_Pulse Age;
with Pushups; /*****/
run;
ods graphics off;
Out[6]:
See Ramon Littell, Walter Stroup, and Rudolf Freund, *SAS for Linear Models*, Fourth Edition, SAS Publishing (2002), p. 11.
The CLM option yields a confidence interval for the subpopulation mean, and the CLI option yields a prediction interval for a value to be drawn at random from the subpopulation. The CLI limits are always wider than the CLM limits, because the CLM limits accommodate only variability in $\widehat{y}$, whereas the CLI limits accommodate variability in $\widehat{y}$ and variability in the future value of y. This is true even though $\widehat{y}$ is used as an estimate of the subpopulation mean as well as a predictor of the future value.
where NOINT is the option that specifies that no intercept be included. In other words, the fitted regression plane is forced to pass through the origin.
Corresponding complications arise regarding the R-square statistic with no-intercept models. Note that R-Square=0.9829 for the no-intercept model in Output 2.10 is greater than R-Square=0.9373 for the model in Output 2.6, although the latter has two more parameters than the former. This seems contrary to the general phenomenon that adding terms to a model causes the R-square to increase. This seeming contradiction occurs because the denominator of the R-square is the Uncorrected Total SS when the NOINT option is used. This is the reason for the message that R-square is redefined at the top of Output 2.10. It is, therefore, not meaningful to compare an R-square for a model that contains an intercept with an R-square for a model that does not contain an intercept.
In [7]:
proc reg data=statdata.fitness;
model Oxygen_Consumption = RunTime / p cli clm influence r xpx i;
id name RunTime;
title 'Predicting Oxygen_Consumption from RunTime';
run;
quit;
title;
Out[7]:
The Model Sum of Squares is 633.01. This is the amount of variability that the model explains.
The Error Sum of Squares is 218.54. This is the amount of variability that the model does not explain.
The Total Sum of Squares is 851.55, which is the total amount of variability in the response.
The Mean Square column shows the ratio of each sum of squares to its degrees of freedom. The mean square model is 633.01, calculated by dividing the model sum of squares by the model DF; it is the average sum of squares for the model. The mean square error is 7.54, an estimate of the population variance, calculated by dividing the error sum of squares by the error DF.
The Root MSE is 2.75. This is the square root of the mean square error in the Analysis of Variance table. The Root MSE is a measure of the standard deviation of Oxygen_Consumption at each value of RunTime.
The Dependent Mean is 47.38, which is the average of Oxygen_Consumption for all 31 subjects.
The Coefficient of Variation is 5.79. This is the size of the standard deviation relative to the mean.
The R-square value is 0.743, which is calculated by dividing the model sum of squares by the total sum of squares. The R-square value is between 0 and 1 and measures the proportion of variation observed in the response that the regression line explains.
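The arithmetic behind these table entries can be checked by hand; here is a minimal sketch in Python using the sums of squares quoted above (n = 31, one predictor):

```python
import math

# Sums of squares from the ANOVA table above (n = 31, one predictor)
ss_model, ss_error = 633.01, 218.54
ss_total = ss_model + ss_error        # 851.55
df_model, df_error = 1, 31 - 2        # 1 and 29

ms_model = ss_model / df_model        # 633.01
ms_error = ss_error / df_error        # ~7.54, estimates the error variance
root_mse = math.sqrt(ms_error)        # ~2.75
r_square = ss_model / ss_total        # ~0.743
coeff_var = 100 * root_mse / 47.38    # ~5.79 (dependent mean = 47.38)
```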
Mean Square Between and Mean Square Within are used to calculate the F ratio (in regression output, these are the mean square model and the mean square error):

$$F = \frac{MS_{Model}}{MS_{Error}}$$
If you create a 95% prediction interval, the interpretation is that you are 95% confident that your interval contains the new observation.
For a given set of data, why is a prediction interval wider than a confidence interval? A prediction interval is wider than a confidence interval because single observations have more variability than sample means.
The difference between a prediction interval and a confidence interval is the standard error.
The standard error for a confidence interval on the mean takes into account the uncertainty due to sampling. The line you computed from your sample will differ from the line that would have been computed if you had the entire population; the standard error takes this uncertainty into account.
The standard error for a prediction interval on an individual observation takes into account the uncertainty due to sampling like above, but also takes into account the variability of the individuals around the predicted mean. The standard error for the prediction interval will be wider than for the confidence interval and hence the prediction interval will be wider than the confidence interval.
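For simple linear regression, the two standard errors differ only by the extra variance of an individual observation around the mean; here is a sketch (the numeric inputs below are illustrative, not taken from the fitness data):

```python
import math

def interval_ses(s, n, x0, xbar, sxx):
    """Standard errors at x0 for the mean response (CLM) and for a new
    individual observation (CLI), given root MSE s, sample size n,
    predictor mean xbar, and corrected sum of squares sxx."""
    leverage = 1 / n + (x0 - xbar) ** 2 / sxx
    se_mean = s * math.sqrt(leverage)        # confidence interval SE
    se_indiv = s * math.sqrt(1 + leverage)   # prediction interval SE
    return se_mean, se_indiv

# Illustrative inputs; se_i exceeds se_m for any x0, so the prediction
# interval is always wider than the confidence interval
se_m, se_i = interval_ses(s=2.75, n=31, x0=10, xbar=10.6, sxx=40)
```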
First, create a data set containing the values of the independent variable for which you want to make predictions. Concatenate the new data set with the original data set. Fit a simple linear regression model to the new data set and specify the P option in the MODEL statement. Because the concatenated observations contain missing values for the response variable, PROC REG doesn't include these observations when fitting the regression model. However, PROC REG does produce predicted values for these observations.
When you use a model to predict future values of the response variable given certain values of the predictor variable, you must stay within the range of values for the predictor variable used to create the model. For example, in the original Fitness data set, values of RunTime range from a little over 8 minutes to a little over 14 minutes. Based on that data, you shouldn't try to predict what Oxygen_Consumption would be for a RunTime value outside that range. The relationship between the predictor variable and the response variable might be different beyond the range of the data.
PROC SCORE DATA=SAS-data-set
SCORE=SAS-data-set
OUT=SAS-data-set
TYPE=name
<options>;
VAR variable(s);
RUN;
QUIT;
In the PROC SCORE statement, the DATA= option specifies the data set containing the observations to score, which is Need_Predictions. The SCORE= option specifies the data set containing the parameter estimates, which is Estimates. The OUT= option specifies the data set that PROC SCORE creates. Let's call this data set Scored. Finally, the TYPE= option tells PROC SCORE what type of data the SCORE= data set contains. In this case, specifying TYPE=PARMS tells SAS to use the parameter estimates in the Estimates data set. The VAR statement specifies the numeric variables to use in computing scores. These variables must appear in both the DATA= and SCORE= input data sets. If you don't specify a VAR statement, PROC SCORE uses all the numeric variables in the SCORE= data set. So it's important to specify a VAR statement with PROC SCORE, because you rarely use all the numeric variables in your data set to compute scores. We'll use RunTime. Next, let's see this process in action.
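Under the hood, scoring with TYPE=PARMS is just applying the fitted equation to each new observation. A sketch with hypothetical parameter estimates (the real values come from the Estimates data set, not from the numbers below):

```python
# Hypothetical parameter estimates standing in for the OUTEST= values
intercept, b_runtime = 82.42, -3.31

new_runtimes = [9, 10, 11, 12, 13, 14, 15]
scores = [intercept + b_runtime * rt for rt in new_runtimes]
# Each score is the predicted Oxygen_Consumption for that RunTime
```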
In [8]:
data need_predictions;
input RunTime @@;
datalines;
9 10 11 12 13 14 15
;
run;
proc reg data=statdata.fitness noprint outest=estimates;
model Oxygen_Consumption=RunTime;
run;
quit;
proc print data=estimates;
title "OUTEST= Data Set from PROC REG";
run;
title;
proc print data = need_predictions;
title "need_predictions Data Set";
run;
proc score data=need_predictions /*dataset to score*/
score=estimates /*dataset containing the parameter estimates*/
out=scored /*the output dataset*/
type=parms; /*tells PROC SCORE what type of data the SCORE= data set contains.*/
var RunTime;
/*The VAR statement specifies the numeric variables to use in computing scores.
These variables must appear in both the DATA= and SCORE= input data sets*/
run;
proc print data=Scored;
title "Scored New Observations";
run;
title;
Out[8]:
In [9]:
proc reg data=statdata.bodyfat2 outest=estimates;
model PctBodyFat2=Weight;
title "Regression of % Body Fat on Weight";
run;
data toscore;
input Weight @@;
datalines;
125 150 175 200 225
;
run;
proc score data=toscore score=estimates
out=scored type=parms;
var Weight;
run;
proc print data=scored;
title "Predicted % Body Fat from Weight 125 150 175 200 225";
run;
title;
Out[9]:
The PLM procedure performs post-fitting statistical analyses and plotting for the contents of a SAS item store that were previously created with the STORE statement in some other SAS/STAT procedure.
The statements that are available in the PLM procedure are designed to reveal the contents of the source item store via the Output Delivery System (ODS) and to perform post-fitting tasks.
The use of item stores and PROC PLM enables you to separate common post-processing tasks, such as testing for treatment differences and predicting new observations under a fitted model, from the process of model building and fitting. A numerically expensive model fitting technique can be applied once to produce a source item store. The PLM procedure can then be called multiple times, and the results of the fitted model are analyzed without incurring the model fitting expenditure again.
Selected PROC PLM option:
PROC PLM RESTORE=item-store-specification <options>;
   EFFECTPLOT <plot-type <(plot-definition options)>> </ options>;
   LSMEANS <model-effects> </ options>;
   LSMESTIMATE model-effect <'label'> values <divisor=n>
      <,...<'label'> values <divisor=n>> </ options>;
   SHOW options;
   SLICE model-effect </ options>;
   WHERE expression;
RUN;
In [10]:
proc sql outobs = 20;
select *
from statdata.ameshousing3;
quit;
proc univariate data=statdata.ameshousing3;
var SalePrice Basement_Area Lot_Area;
run;
Out[10]:
Run the same model in PROC GLM. When you run a linear regression model with only two predictor variables, the output includes a contour fit plot by default. We specify CONTOURFIT to tell SAS to overlay the contour plot with a scatter plot of the observed data.
Here is the contour fit plot with the overlaid scatter plot that we requested. We can use this plot to see how well the model predicts observed values. The plot shows predicted values of SalePrice as gradations of the background color from blue, representing low values, to red, representing high values. The dots, which are similarly colored, represent the actual data. Observations that are perfectly fit would show the same color within the circle as outside the circle. The lines on the graph help you read the actual predictions at even intervals.
For example, this point near the upper-right represents an observation with a basement area of about 1,500 square feet, a lot size of about 17,000 square feet, and a predicted value of over \$180,000 for sale price. However, the dot’s color shows that its observed sale price is actually closer to about \$160,000.
In [11]:
ods graphics on;
proc reg data=statdata.ameshousing3 ;
model SalePrice=Basement_Area Lot_Area;
title "Model with Basement Area and Lot Area";
run;
quit;
proc glm data=statdata.ameshousing3
plots(only)=(contourfit);
model SalePrice=Basement_Area Lot_Area;
contrast 'Basement_Area=0' Basement_Area 1;
contrast 'Basement_Area=Lot_Area' Basement_Area 1 Lot_Area -1;
contrast 'Basement_Area=Lot_Area=0' Basement_Area 1, Lot_Area 1;
/*CONTRAST statements can be used to test hypotheses about
any linear combination of parameters in the model.*/
estimate 'Basement_Area=0' Basement_Area 1;
estimate 'Basement_Area=Lot_Area' Basement_Area 1 Lot_Area -1;
/*The ESTIMATE statement is used in essentially the same way as the CONTRAST statement.
But instead of F-tests for linear combinations, you get estimates of them along with standard errors.
However, the ESTIMATE statement can estimate only one linear combination at a time, whereas the
CONTRAST statement could be used to test two or more linear combinations simultaneously */
store out=multiple;
title "Model with Basement Area and Gross Living Area";
run;
quit;
proc plm restore=multiple plots=all;
effectplot contour (y=Basement_Area x=Lot_Area);
effectplot slicefit(x=Lot_Area sliceby=Basement_Area=250 to 1000 by 250);
run;
title;
Out[11]:
In the MODEL statement, following a forward slash, you add the SELECTION= option to specify the method used to select the model. The default is NONE, which in this case would calculate the full regression model, because you specified all the variables in the MODEL statement. To calculate the all-possible regression model instead, you specify the CP, RSQUARE, or ADJRSQ statistics as the SELECTION= value. Here all three are specified. The first statistic that you list here determines the sorting order in the output.
Here's a question: For this PROC REG step, how are the models sorted? Specifying CP as the first statistic sorts the models by the value of CP. To produce only a specific number of models, you can specify the BEST= option in the MODEL statement. For example, BEST=20 displays the 20 best models based on your sorting statistic, which in this case is CP.
Finally, you can add an optional label to the MODEL statement to label your output. For this all-possible regression model, let's add the label ALL_REG. Notice that the label must end in a colon.
Each star in the Cp plot represents the best model for a given number of parameters.
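Mallows' Cp, the statistic being plotted, can be sketched in a few lines; models with Cp close to (or below) p, the number of parameters, are typically the candidates worth examining (the numeric values below are illustrative):

```python
def mallows_cp(sse_p, mse_full, n, p):
    """Mallows' Cp = SSE_p / MSE_full - n + 2p, where SSE_p is the error
    sum of squares of the p-parameter candidate model and MSE_full is
    the mean square error of the full model with all predictors."""
    return sse_p / mse_full - n + 2 * p

# Illustrative: candidate with SSE = 220, full-model MSE = 7.5, n = 31, p = 3
cp = mallows_cp(220, 7.5, 31, 3)   # about 4.3
```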
In [12]:
ods graphics / imagemap=on;
proc reg data=statdata.fitness plots(only)=(cp);
ALL_REG: model Oxygen_Consumption=
Performance RunTime Age Weight
Run_Pulse Rest_Pulse Maximum_Pulse
/ selection=cp rsquare adjrsq best=20;
title 'Best Models Using All-Regression Option';
run;
quit;
title;
Out[12]:
In [13]:
proc reg data=statdata.fitness;
PREDICT_mpc: model Oxygen_Consumption=
RunTime Age Run_Pulse Maximum_Pulse;
EXPLAIN_hcp: model Oxygen_Consumption=
RunTime Age Weight Run_Pulse Maximum_Pulse;
title 'Check "Best" Two Candidate Models';
run;
quit;
title;
Out[13]:
Stepwise selection methods include forward, backward, and stepwise approaches. In this course, you use these methods to select variables based on their p-values, and we will discuss other methods as well. Let's look at each of these three methods in detail.
Forward selection starts with no predictor variables in the model. It selects the best one-variable model (the most significant variable when run by itself). Then it selects the best two-variable model that includes the variable in the first model. So, after a variable is added to the model, it stays in, even if it becomes insignificant later. Forward selection keeps adding variables, one at a time, until no significant terms are left to add.
Backward selection, also called backward elimination, starts with all predictor variables in the model. It removes variables one at a time, starting with the most non-significant variable. After a variable is removed from the model, it cannot reenter. Backward selection stops when only significant terms are left in the model.
Using automated model selection results in biases in parameter estimates, predictions, and standard errors, incorrect calculation of degrees of freedom, and p-values that tend to err on the side of overestimating significance.
So, how can you avoid these issues? One way is to hold out some of your data in order to perform an honest assessment of how well your model performs on a different sample of data than you used to develop the model. You split your data into two data sets: the training data and the holdout data, which is also called the validation data. You use the training data to build your model, and you use the holdout data to assess and compare potential models.
Other honest assessment approaches include cross-validation or bootstrapping. You might choose to perform cross-validation if your data set isn’t large enough to split and hold out some data for validation. Alternatively, you can use a bootstrapping method to obtain correct standard errors and p-values. Bootstrapping is a resampling method that tries to approximate the distribution of the parameter estimates to estimate the standard error.
One last thing to keep in mind is that the stepwise techniques don’t take any collinearity in your model into account. Collinearity means that predictor variables in the same model are highly correlated. If collinearity is present in your model, you might want to consider first reducing the collinearity as much as possible and then running stepwise methods on the remaining variables.
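One common diagnostic for collinearity (requested later in this document with the VIF option in PROC REG) is the variance inflation factor; a minimal sketch of the calculation:

```python
def vif(r_squared_j):
    """Variance inflation factor for predictor j, where r_squared_j is the
    R-square from regressing predictor j on all the other predictors.
    Values above about 10 are often taken to signal serious collinearity."""
    return 1 / (1 - r_squared_j)

v = vif(0.90)   # about 10: 90% of this predictor's variance is
                # explained by the other predictors
```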
In [14]:
%let interval=Gr_Liv_Area Basement_Area Garage_Area Deck_Porch_Area
Lot_Area Age_Sold Bedroom_AbvGr Total_Bathroom ;
ods graphics on;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISE: model SalePrice=&interval / selection=stepwise showpvalues
details=steps select=SL slstay=0.05 slentry=0.05;
title "Stepwise Model Selection for SalePrice - SL 0.05";
run;
/*Optional code that will execute forward and backward selection, each with slentry and slstay = 0.05.
proc glmselect data=statdata.ameshousing3 plots=all;
FORWARD: model SalePrice=&interval / selection=forward details=steps select=SL slentry=0.05;
title "Forward Model Selection for SalePrice - SL 0.05";
run;
proc glmselect data=statdata.ameshousing3 plots=all;
BACKWARD: model SalePrice=&interval / selection=backward details=steps select=SL slstay=0.05;
title "Backward Model Selection for SalePrice - SL 0.05";
run;
*/
Out[14]:
In [15]:
%let interval=Gr_Liv_Area Basement_Area Garage_Area Deck_Porch_Area
Lot_Area Age_Sold Bedroom_AbvGr Total_Bathroom ;
ods graphics on;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISEAIC: model SalePrice = &interval / selection=stepwise details=steps select=AIC;
title "Stepwise Model Selection for SalePrice - AIC";
run;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISEBIC: model SalePrice = &interval / selection=stepwise details=steps select=BIC;
title "Stepwise Model Selection for SalePrice - BIC";
run;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISEAICC: model SalePrice = &interval / selection=stepwise details=steps select=AICC;
title "Stepwise Model Selection for SalePrice - AICC";
run;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISESBC: model SalePrice = &interval / selection=stepwise details=steps select=SBC;
title "Stepwise Model Selection for SalePrice - SBC";
run;
Out[15]:
In [16]:
title "Forcing Variables into a Stepwise Model";
proc reg data=exercise;
model Pushups = Max_Pulse Age Rest_Pulse Run_Pulse /
selection = stepwise include=1;
run;
quit;
Out[16]:
The INFLUENCE option gives you statistics that show you how much each observation changes aspects of the regression depending on whether that observation is included. The R option gives you more details about the residuals, as well as the value of the Cook’s D statistic.
Plot Name | Description
---|---
COOKSD | Cook’s D statistic (the effect on the predicted value)
RSTUDENTBYPREDICTED | Externally studentized residuals by predicted value
DFFITS | The overall effect of each observation on its own predicted value
DFBETAS | The effect on each beta (one computed for each variable)
The Cook's D statistic measures the distance between the set of parameter estimates with that observation deleted from your regression analysis and the set of parameter estimates with all the observations in your regression analysis. If any observation has a Cook's D statistic greater than 4 divided by n, where n is the sample size, that observation is influential. The Cook's D statistic is most useful for identifying influential observations when the purpose of your model is parameter estimation.
STUDENT residuals are calculated by dividing the residuals by their standard errors, so you can think of each STUDENT residual as roughly equivalent to a z-score. Typically, people consider z-scores large if their absolute value is greater than 2. So, for a relatively small sample size, a cutoff value of plus or minus 2 is reasonable for STUDENT residuals. However, with a large sample, it's very likely that even more STUDENT residuals greater than plus or minus 2 will occur just by chance. So, for larger data sets, you should typically use a larger cutoff value, such as an absolute value of 3.
SAS computes the RStudent value by computing the residual between each data point and a regression line that was computed with that data point removed, and then dividing by the standard error. Why is this computation necessary? If you have a very influential data point, it will pull the line (or surface) closer to the point. Then, when you compute the residual, you get a smaller value than if you had computed the regression with the data point omitted. Various texts refer to the RStudent residuals as deleted residuals or externally standardized residuals. You can use two rules of thumb to evaluate RSTUDENT residuals. First, if the RSTUDENT residual is different from the STUDENT residual, the observation is probably influential. Second, if the absolute value of the RSTUDENT residuals is greater than 2 or 3, you've probably detected an influential observation.
DFFITS measures the impact that each observation has on its own predicted value. For each observation, DFFITS is calculated using two predicted values. The first predicted value is calculated from a model using the entire data set to estimate model parameters. The second predicted value is calculated from a model using the data set with that particular observation removed to estimate model parameters. The difference between the two predicted values is divided by the standard error of the predicted value, without the observation. If the standardized difference between these predicted values is large, that particular observation has a large effect on the model fit. The rule of thumb for DFFITS has two versions. The general cutoff value is 2. The more precise cutoff is 2 times the square root of p divided by n, where p is the number of terms in the model, including the intercept, and n is the sample size. If the absolute value of DFFITS for any observation is greater than this cutoff value, you've detected an influential observation. DFFITS is most useful for predictive models.
DFBETAS measure the change in each parameter estimate. One DFBETA is calculated per predictor variable per observation. Each DFBETA is calculated by taking the estimated coefficient for that particular predictor variable, using all the data, and subtracting the estimated coefficient for that particular predictor variable with the current observation removed. This difference in the betas is divided by its standard error. This calculation is repeated for all predictor variables and all observations. Large DFBETAS indicate observations that are influential in estimating a given parameter. For DFBETAS, you use the same two rules of thumb as for DFFITS. The general cutoff value is 2. The more precise cutoff is $2{\sqrt{1/n}}$, where n is the sample size.
The DFBETAS plot is a panel plot. It contains one plot for each parameter. In this case, because we have so many parameters, SAS created two panels.
You can use STUDENT residuals to detect outliers. To detect influential observations, you can use RSTUDENT residuals and the DFFITS and Cook's D statistics.
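The rule-of-thumb cutoffs described above are easy to compute up front; a sketch (the values of n and p below are illustrative):

```python
import math

def influence_cutoffs(n, p):
    """Rule-of-thumb cutoffs: Cook's D uses 4/n; DFFITS and DFBETAS use the
    'precise' versions 2*sqrt(p/n) and 2*sqrt(1/n), where p counts model
    terms including the intercept and n is the sample size."""
    return {
        "cooks_d": 4 / n,
        "dffits": 2 * math.sqrt(p / n),
        "dfbetas": 2 * math.sqrt(1 / n),
    }

cuts = influence_cutoffs(n=300, p=8)
# cooks_d ~ 0.013, dffits ~ 0.327, dfbetas ~ 0.115
```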
What to do with influential observations?
First, recheck for data entry errors.
Second, if the data appears to be valid, consider whether you have an adequate model. A different model might fit the data better. Here's one rule of thumb: Divide the number of influential observations you detect by the number of observations in your data set. If the result is greater than 5%, you probably have the wrong model. You might need a model that uses higher order terms.
Third, determine whether the influential observation is valid but just unusual. If you had a larger sample size there might be more observations similar to the unusual one. You might have to collect more data to confirm the relationship suggested by the influential observation.
In [17]:
%let interval=Gr_Liv_Area Basement_Area Garage_Area Deck_Porch_Area
Lot_Area Age_Sold Bedroom_AbvGr Total_Bathroom ;
ods select none;
proc glmselect data=statdata.ameshousing3 plots=all;
STEPWISE: model SalePrice = &interval / selection=stepwise
details=steps select=SL slentry=0.05 slstay=0.05;
title "Stepwise Model Selection for SalePrice - SL 0.05";
run;
quit;
ods select all;
ods graphics on;
ods output RSTUDENTBYPREDICTED=Rstud
COOKSDPLOT=Cook
DFFITSPLOT=Dffits
DFBETASPANEL=Dfbs;
proc reg data=statdata.ameshousing3
plots(unpack only label)=
(RSTUDENTBYPREDICTED
COOKSD
DFFITS
DFBETAS);
SigLimit: model SalePrice = &_GLSIND;
title 'SigLimit Model - Plots of Diagnostic Statistics';
run;
quit;
Out[17]:
Now let’s look at the Dffits data set. We see DFFITS influence statistics in the DFFITS column. But notice that for some observations like 7, 21, and 22 there are missing values in the DFFITS column. For observations that are flagged as influential by DFFITS, the statistics are in the DFFITSOUT column rather than the DFFITS column. Because the DFFITS values are not all in the same column, if we want to change the cutoff or ask questions about the DFFITS values we’ll have to do a little extra work.
Go back to the DFBETAS panel plot. Here we see the order of the variables in the _GLSIND macro: above grade living area, basement area, garage area, deck/porch area, lot area, age sold, and bedroom above grade. So in the Dfbs data set, _DFBETAS1 and _DFBETASOUT1 are for the intercept, _DFBETAS2 and _DFBETASOUT2 are for above grade living area, _DFBETAS3 and _DFBETASOUT3 are for basement area, and so forth, ending with _DFBETAS8 and _DFBETASOUT8 for the last predictor variable, bedroom above grade.
In [18]:
/* Before running the code below,*/
/* run the code from the previous demo, */
/* Looking for Influential Observations, Part 1. */
/* Run both programs in the same SAS session.*/
title;
/*Check outLevlabel column*/
proc print data=Rstud noobs;
run;
/*Check CooksDLabel column*/
proc print data=Cook noobs;
run;
/*Check DFFITSOUT column*/
proc print data=Dffits noobs;
run;
Out[18]:
In [19]:
/*Check rows and column*/
proc print data=Dfbs;
run;
Out[19]:
First, we’ll use a DATA step to create a data set named Dfbs01 from the first 300 observations of the Dfbs data set. In the next DATA step, we’ll create a data set named Dfbs02 starting with observation 301. Then we’ll combine the two new data sets by using this UPDATE statement in a DATA step, combining by observation. Let’s run these three DATA steps and take a look at the new data sets in the temporary Work library.
In [20]:
data Dfbs01;
set Dfbs (obs=300);
run;
data Dfbs02;
set Dfbs (firstobs=301);
run;
data Dfbs2;
update Dfbs01 Dfbs02;
by Observation;
run;
Out[20]:
In [21]:
proc print data = Dfbs2;
run;
proc sql number;
create table Dfbs3 as
select o.Model, o.Dependent, o.Observation,
o._DFBETAS1, o._DFBETASOUT1,
o._DFBETAS2, o._DFBETASOUT2, o._DFBETAS3, o._DFBETASOUT3, o._DFBETAS4, o._DFBETASOUT4 ,
o._DFBETAS5, o._DFBETASOUT5, o._DFBETAS6, o._DFBETASOUT6,
t._DFBETAS7, t._DFBETASOUT7, t._DFBETAS8, t._DFBETASOUT8
from Dfbs01 as o inner join Dfbs02 as t
on o.observation = t.observation;
select * from Dfbs3;
quit;
Out[21]:
In [22]:
data influential;
/* Merge data sets from above.*/
merge Rstud
Cook
Dffits
Dfbs2;
by observation;
/* Flag observations that have exceeded at least one cutpoint;*/
if (ABS(Rstudent)>3) or (Cooksdlabel ne ' ') or Dffitsout then flag=1;
array dfbetas{*} _dfbetasout: ;
do i=2 to dim(dfbetas);
if dfbetas{i} then flag=1;
end;
/* Set to missing values of influence statistics for those*/
/* that have not exceeded cutpoints;*/
if ABS(Rstudent)<=3 then RStudent=.;
if Cooksdlabel eq ' ' then CooksD=.;
/* Subset only observations that have been flagged.*/
if flag=1;
drop i flag;
run;
title;
proc print data=influential;
id observation;
var Rstudent CooksD Dffitsout _dfbetasout:;
run;
Out[22]:
In [23]:
title "Displaying Influential Observations";
proc reg data=exercise plots(only) = (cooksd(label)
rstudentbypredicted(label));
id Subj;
model Pushups = Rest_Pulse / influence r;
run;
quit;
Out[23]:
In [24]:
ods graphics on;
title "Detecting Influential Observations in Multiple Regression";
proc reg data=exercise
plots(label only) = (cooksd
rstudentbypredicted
dffits
dfbetas);
id Subj;
model Pushups = Age Max_Pulse Run_Pulse / influence;
run;
quit;
ods graphics off;
Out[24]:
In [25]:
data Dummy;
set Store;
*Create dummy variable for Gender;
if Gender = 'Male' then Male = 1;
else if Gender = 'Female' then Male = 0;
*Create Dummy Variable for Region;
if Region not in ('North' 'East' 'South' 'West') then
   call missing(North, East, South);
else do;
   North = (Region = 'North');
   East  = (Region = 'East');
   South = (Region = 'South');
end;
run;
title "Creating and Using Dummy variables";
proc print data=Dummy(obs=10) noobs;
var Region Gender Male North East South;
run;
Out[25]:
In [26]:
title "Running a Multiple Regression with Dummy Variables";
proc reg data=Dummy;
model Music_Sales = Total_Sales Male North East South;
run;
quit;
Out[26]:
In [27]:
title "Using the VIF to Detect Collinearity";
proc reg data=exercise;
model Pushups = Age Rest_Pulse Max_Pulse Run_Pulse / VIF;
run;
quit;
Out[27]:
In the PLOTS= option, the global plot option ONLY suppresses the default plots. QQ requests a residual quantile-quantile plot to assess the normality of the residual error, and RESIDUALBYPREDICTED requests a plot of residuals by predicted values. RESIDUALS requests a panel of plots of residuals by the predictor variables in the model.
In [28]:
ods graphics / imagemap=on width=800;
proc reg data=statdata.fitness
plots(only)=(QQ RESIDUALBYPREDICTED RESIDUALS);
PREDICT: model Oxygen_Consumption =
RunTime Age Run_Pulse Maximum_Pulse;
id Name;
title 'PREDICT Model - Plots of Diagnostic Statistics';
run;
quit;
title;
Out[28]:
In [29]:
*variables
Region
Advertising
Gender
Book_Sales
Music_Sales
Electronics_Sales
Total_Sales
;
proc format;
value yesno 1 = 'Yes'
0 = 'No';
data Store;
length Region $ 5;
call streaminit(57676);
do Transaction = 1 to 200;
R = ceil(rand('uniform')*10);
select(R);
when(1) Region = 'East';
when(2) Region = 'West';
when(3) Region = 'North';
when(4) Region = 'South';
otherwise;
end;
Advertising = rand('bernoulli',.6);
if rand('uniform') lt .6 then Gender = 'Female';
else Gender = 'Male';
Book_Sales = abs(round(rand('normal',250,50) + 30*(Gender = 'Female')
+ 30*Advertising,10)) ;
Music_Sales = abs(round(rand('uniform')*40 + rand('normal',50,5)
+ 30*(Region = 'East' and Gender = 'Male')
- 20*(Region = 'West' and Gender = 'Female'),5) + 10*Advertising);
Electronics_Sales = abs(round(rand('normal',300,60) + 70*(Gender = 'Male')
+ 55*Advertising + 50*(Region = 'East') - 20*(Region = 'South')
+ 75*(Region = 'West'),10));
Total_Sales = sum(Book_Sales,Music_Sales,Electronics_Sales);
output;
end;
drop R;
format Book_Sales Music_Sales Electronics_Sales Total_Sales dollar9.
Advertising yesno.;
run;
/*title "Listing of Store";*/
/*proc print data=store heading=h;*/
/*run;*/
/*proc univariate data=store;*/
/* var Book_Sales -- Total_Sales;*/
/* histogram;*/
/*run;*/
/**/
/*title "Scatter Matrix for Store Variables";*/
/*proc sgscatter data=store;*/
/* matrix Book_Sales -- Total_Sales / group = Gender;*/
/*run;*/
/**/
/*proc sgplot data=store;*/
/* scatter x=Book_Sales y=Total_Sales / group=Gender;*/
/*run;*/
proc rank data=store out=median_sales groups=2;
var Total_Sales;
ranks Sales_Group;
run;
proc format;
value sales 0 = 'Low'
1 = 'High';
run;
/*proc logistic data=median_sales order=formatted;*/
/* class Gender(param=ref ref='Male');*/
/* model Sales_Group = Gender;*/
/* format Sales_Group sales.;*/
/*quit;*/
/**/
/*proc logistic data=median_sales order=formatted;*/
/* class Gender(param=ref ref='Male')*/
/* Advertising (param=ref ref='No');*/
/* model Sales_Group = Gender Advertising;*/
/* format Sales_Group sales.;*/
/*quit;*/
*Create test data set;
libname example 'c:\books\statistics by example';
data example.Blood_Pressure;
call streaminit(37373);
do Drug = 'Placebo','Drug A','Drug B';
do i = 1 to 20;
Subj + 1;
if mod(Subj,2) then Gender = 'M';
else Gender = 'F';
SBP = rand('normal',130,10) +
7*(Drug eq 'Placebo') - 6*(Drug eq 'Drug B');
SBP = round(SBP,2);
DBP = rand('normal',80,5) +
3*(Drug eq 'Placebo') - 2*(Drug eq 'Drug B');
DBP = round(DBP,2);
if Subj in (5,15,25,55) then call missing(SBP, DBP);
if Subj in (4,18) then call missing(Gender);
output;
end;
end;
drop i;
run;
/*title "Listing of the first 25 observations from Blood_Pressure";*/
/*proc print data=example.Blood_Pressure(obs=25) noobs;*/
/* var Subj Drug SBP DBP;*/
/*run;*/
data exercise;
call streaminit(7657657);
do Subj = 1 to 50;
Age = round(rand('normal',50,15));
Pushups = abs(int(rand('normal',40,10) - .30*age));
Rest_Pulse = round(rand('normal',50,8) + .35*age);
Max_Pulse = round(rest_pulse + rand('normal',50,5) - .05*age);
Run_Pulse = round(max_pulse - rand('normal',3,3));
output;
end;
run;
*Data set for a paired t-test example;
data reading;
input Subj Before After @@;
datalines;
1 100 110 2 120 121 3 130 140 4 90 110 5 85 92
6 133 137 7 210 209 8 155 179
;
/*title "Listing of Data Set READING";*/
/*proc print data=reading noobs;*/
/*run;*/
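/* The READING data set above is laid out for a paired t-test; a
   minimal sketch of that analysis (not part of the original program): */
proc ttest data=reading;
   paired Before*After;  /* tests whether mean(Before - After) = 0 */
run;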
*Data set that violates assumptions for a t-test;
data salary;
call streaminit(57575);
do Subj = 1 to 50;
do Gender = 'M','F';
Income = round(20000*rand('exponential') + rand('uniform')*7000*(Gender = 'M'));
output;
end;
end;
run;
/*proc univariate data=salary;*/
/* class Gender;*/
/* id Subj;*/
/* var Income;*/
/* histogram Income;*/
/*run;*/
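/* Because Income is strongly skewed (exponential), a nonparametric
   alternative such as the Wilcoxon rank-sum test is one option.
   A hedged sketch, not in the original program: */
proc npar1way data=salary wilcoxon;
   class Gender;
   var Income;
run;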
*Data set risk for logistic regression example;
proc format;
value yesno 1 = 'Yes'
0 = 'No';
run;
data Risk;
call streaminit(13579);
length Age_Group $ 7;
do i = 1 to 250;
do Gender = 'F','M';
Age = round(rand('uniform')*30 + 50);
if missing(Age) then Age_Group = ' ';
else if Age lt 60 then Age_Group = '1:< 60';
else if Age le 70 then Age_Group = '2:60-70';
else Age_Group = '3:71+';
Chol = rand('normal',200,30) + rand('uniform')*8*(Gender='M');
Chol = round(Chol);
Score = .3*chol + age + 8*(Gender eq 'M');
Heart_Attack = (Score gt 130)*(rand('uniform') lt .2);
output;
end;
end;
keep Gender Age Age_Group chol Heart_Attack;
format Heart_Attack yesno.;
run;
/*title "Listing of first 100 observations from RISK";*/
/*proc print data=risk(obs=100);*/
/*run;*/
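/* The RISK data set is built for a logistic regression example. One
   possible model predicting Heart_Attack (a sketch, not from the
   source); EVENT='Yes' matches the formatted value from the YESNO
   format so that the model predicts the probability of a heart attack: */
proc logistic data=Risk;
   class Gender(param=ref ref='F');
   model Heart_Attack(event='Yes') = Gender Age Chol;
run;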
Out[29]: