


In [1]:

    
# This is a demo of H2O's random forest (DRF) with and without class balancing
# It uploads the airlines data set and splits it into training and validation frames
# Then, it trains two forests and compares them on a separate test set
import h2o



In [2]:

    
h2o.init()









    




H2O cluster uptime:         5 minutes 14 seconds 128 milliseconds
H2O cluster version:        3.1.0.99999
H2O cluster name:           ece
H2O cluster total nodes:    1
H2O cluster total memory:   4.44 GB
H2O cluster total cores:    8
H2O cluster allowed cores:  8
H2O cluster healthy:        True
H2O Connection ip:          127.0.0.1
H2O Connection port:        54321



In [3]:

    
air = h2o.upload_file(path=h2o.locate("smalldata/airlines/AirlinesTrain.csv.zip"))









    



Parse Progress: [##################################################] 100%
Uploaded py6f514d4e-23da-4051-9994-ddb299009665 into cluster with 24421 rows and 12 cols



In [4]:

    
r = air[0].runif()
air_train = air[r < 0.8]
air_valid = air[r >= 0.8]
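
The cell above draws one uniform random number per row and sends rows with a draw below 0.8 to training and the remaining rows to validation. The same idea can be sketched in plain Python, without H2O. The helper name `random_split` and the toy rows are hypothetical and only illustrate the masking rule; the split sizes come out close to 80/20 but not exactly so.

```python
import random


def random_split(rows, train_fraction=0.8, seed=None):
    """Split rows into (train, valid) using one uniform draw per row."""
    rng = random.Random(seed)
    train, valid = [], []
    for row in rows:
        # Same rule as `r < 0.8` / `r >= 0.8`: each row lands in exactly one part.
        if rng.random() < train_fraction:
            train.append(row)
        else:
            valid.append(row)
    return train, valid


rows = list(range(1000))  # hypothetical stand-in for the rows of a frame
train, valid = random_split(rows, seed=12)
print(len(train), len(valid))
```

The notebook's `runif()` call does not pass a seed, so its split may differ from run to run; the sketch takes an optional seed to make the split repeatable.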



In [5]:

    
myX = ["Origin", "Dest", "Distance", "UniqueCarrier", "fMonth", "fDayofMonth", "fDayOfWeek"]
myY = "IsDepDelayed"



In [6]:

    
rf_no_bal = h2o.random_forest(x=air_train[myX], y=air_train[myY],
                              validation_x=air_valid[myX], validation_y=air_valid[myY],
                              seed=12, ntrees=10, max_depth=20, balance_classes=False)
rf_no_bal.show()









    



drf Model Build Progress: [##################################################] 100%
Model Details
=============
H2OBinomialModel :  Distributed RF
Model Key:  DRFModel__81f49ff2c23a04a37e910bcc58fb4215

Model Summary:

number_of_trees  model_size_in_bytes  min_depth  max_depth  mean_depth  min_leaves  max_leaves  mean_leaves
10.0             149878.0             20.0       20.0       20.0        879.0       1140.0      1023.3






    



ModelMetricsBinomial: drf
** Reported on train data. **

MSE: 0.228111120262
R^2: 0.080357095838
LogLoss: 0.841536443761
AUC: 0.686450906072
Gini: 0.372901812144

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.286623619698:

       NO      YES      Error   Rate
NO     1569.0  7236.0   0.8218  (7236.0/8805.0)
YES    651.0   9877.0   0.0618  (651.0/10528.0)
Total  2220.0  17113.0  0.8836  (0.8836/19333.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.286623619698     0.714663000615  325.0
f2                      8.00260088661e-05  0.856701114818  399.0
f0point5                0.618131936925     0.67159241288   173.0
accuracy                0.510547936844     0.641028293591  224.0
precision               0.932283611338     0.819430814524  23.0
absolute_MCC            0.620255349514     0.282106612071  172.0
min_per_class_accuracy  0.574981335833     0.636493161094  194.0
tns                     1.0                8708.0          0.0
fns                     1.0                10192.0         0.0
fps                     8.00260088661e-05  8805.0          399.0
tps                     8.00260088661e-05  10528.0         399.0
tnr                     1.0                0.988983532084  0.0
fnr                     1.0                0.968085106383  0.0
fpr                     8.00260088661e-05  1.0             399.0
tpr                     8.00260088661e-05  1.0             399.0






    



ModelMetricsBinomial: drf
** Reported on validation data. **

MSE: 0.215203503359
R^2: 0.128107873875
LogLoss: 0.626040757587
AUC: 0.710222541095
Gini: 0.42044508219

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.351204755018:

       NO     YES     Error   Rate
NO     557.0  1608.0  0.7427  (1608.0/2165.0)
YES    216.0  2504.0  0.0794  (216.0/2720.0)
Total  773.0  4112.0  0.8221  (0.8221/4885.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.351204755018     0.733021077283  322.0
f2                      0.1770355165       0.863265826009  385.0
f0point5                0.642643034239     0.696284032377  169.0
accuracy                0.515855323573     0.659160696008  238.0
precision               0.957237901508     0.95            10.0
absolute_MCC            0.642643034239     0.316924653321  169.0
min_per_class_accuracy  0.574910362384     0.655889145497  205.0
tns                     0.998260494554     2164.0          0.0
fns                     0.998260494554     2717.0          0.0
fps                     0.104244194428     2165.0          399.0
tps                     0.104244194428     2720.0          399.0
tnr                     0.998260494554     0.999538106236  0.0
fnr                     0.998260494554     0.998897058824  0.0
fpr                     0.104244194428     1.0             399.0
tpr                     0.104244194428     1.0             399.0






    



Scoring History:

timestamp            duration   number_of_trees  training_MSE    training_logloss  training_AUC    training_classification_error  validation_MSE  validation_logloss  validation_AUC  validation_classification_error
2015-05-22 13:26:25  0.334 sec  1.0              0.257064819084  1.92745589155     0.645868499693  0.428229328074                 0.400629273877  2.39647969232       0.655043642168  0.411463664278
2015-05-22 13:26:25  0.440 sec  2.0              0.253175703449  1.99065654625     0.653143456319  0.420079146593                 0.363333128699  1.40768739382       0.67876443418   0.386489252815
2015-05-22 13:26:25  0.552 sec  3.0              0.250745679307  1.78042186773     0.653824627008  0.416884366836                 0.329983010029  1.05156077229       0.692085654123  0.38792221085
2015-05-22 13:26:25  0.636 sec  4.0              0.244652116515  1.55450013636     0.663639012346  0.417550274223                 0.300502917089  0.87597034314       0.700898315446  0.373797338792
2015-05-22 13:26:26  0.727 sec  5.0              0.240277797376  1.37589402537     0.669836126819  0.403455748175                 0.276923048874  0.785013264077      0.698065904768  0.37011258956
2015-05-22 13:26:26  0.823 sec  6.0              0.23645204618   1.17572966227     0.675103136318  0.404289161913                 0.255862667171  0.726255343312      0.701924840375  0.378915046059
2015-05-22 13:26:26  0.923 sec  7.0              0.23381891121   1.05318282497     0.677629857572  0.407652843095                 0.239395686254  0.684070706264      0.704263771906  0.362128966223
2015-05-22 13:26:26  1.023 sec  8.0              0.231583272153  0.975679020094    0.680780516942  0.391148954063                 0.226867426377  0.653676735705      0.707648332428  0.367656090072
2015-05-22 13:26:26  1.129 sec  9.0              0.22954871378   0.892928560467    0.684517249362  0.407442102524                 0.219018616188  0.635320181722      0.708369107458  0.381166837257
2015-05-22 13:26:26  1.242 sec  10.0             0.228111120262  0.841536443761    0.686450906072  0.407955309574                 0.215203503359  0.626040757587      0.710222541095  0.373387922211






    



Variable Importances:

variable       relative_importance  scaled_importance  percentage
Origin         5074.82666016        1.0                0.332755556342
fDayofMonth    3977.50952148        0.783772488766     0.260804650545
Dest           2911.67407227        0.573748477978     0.19091799399
UniqueCarrier  1222.79492188        0.240953042096     0.080178463575
fDayOfWeek     1001.97784424        0.197440801694     0.0656995238122
Distance       992.55279541         0.195583585781     0.0650815248979
fMonth         69.5790481567        0.0137106255674    0.00456228683847
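
The three numeric columns of the variable importance table are related: the printed values are consistent with `scaled_importance` being the relative importance divided by the largest one, and `percentage` being the relative importance divided by the column total. The sketch below recomputes both from the `relative_importance` values copied from the output above. The helper name `normalise` is hypothetical, and the relationship is inferred from the printed numbers rather than taken from H2O's documentation.

```python
# relative_importance values copied from the output above
relative = {
    "Origin": 5074.82666016,
    "fDayofMonth": 3977.50952148,
    "Dest": 2911.67407227,
    "UniqueCarrier": 1222.79492188,
    "fDayOfWeek": 1001.97784424,
    "Distance": 992.55279541,
    "fMonth": 69.5790481567,
}


def normalise(relative_importance):
    """Return {name: (scaled, percentage)} from raw relative importances."""
    top = max(relative_importance.values())
    total = sum(relative_importance.values())
    return {
        name: (value / top, value / total)
        for name, value in relative_importance.items()
    }


for name, (scaled, share) in normalise(relative).items():
    print(f"{name:15s} scaled={scaled:.6f} percentage={share:.6f}")
```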



In [7]:

    
rf_bal = h2o.random_forest(x=air_train[myX], y=air_train[myY],
                           validation_x=air_valid[myX], validation_y=air_valid[myY],
                           seed=12, ntrees=10, max_depth=20, balance_classes=True)
rf_bal.show()









    



drf Model Build Progress: [##################################################] 100%
Model Details
=============
H2OBinomialModel :  Distributed RF
Model Key:  DRFModel__924d4015f4c523d250c26449b16945f4

Model Summary:

number_of_trees  model_size_in_bytes  min_depth  max_depth  mean_depth  min_leaves  max_leaves  mean_leaves
10.0             161279.0             20.0       20.0       20.0        1027.0      1201.0      1095.7






    



ModelMetricsBinomial: drf
** Reported on train data. **

MSE: 0.227050947395
R^2: 0.084631242813
LogLoss: 0.78142590824
AUC: 0.704638202454
Gini: 0.409276404907

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.401939962958:

       NO      YES      Error   Rate
NO     3873.0  6648.0   0.6319  (6648.0/10521.0)
YES    1460.0  9070.0   0.1387  (1460.0/10530.0)
Total  5333.0  15718.0  0.7706  (0.7706/21051.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.401939962958     0.691100274307  264.0
f2                      0.0                0.833452058698  399.0
f0point5                0.605764139792     0.653672952435  170.0
accuracy                0.586745397134     0.652368058525  180.0
precision               0.978060912989     0.836501901141  5.0
absolute_MCC            0.586745397134     0.304762140786  180.0
min_per_class_accuracy  0.582231845092     0.650603554795  182.0
tns                     1.0                10452.0         0.0
fns                     1.0                10199.0         0.0
fps                     0.0                10521.0         399.0
tps                     0.0                10530.0         399.0
tnr                     1.0                0.993441688052  0.0
fnr                     1.0                0.968566001899  0.0
fpr                     0.0                1.0             399.0
tpr                     0.0                1.0             399.0






    



ModelMetricsBinomial: drf
** Reported on validation data. **

MSE: 0.216562495458
R^2: 0.122601948124
LogLoss: 0.623880060288
AUC: 0.708433551827
Gini: 0.416867103654

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.405045186133:

       NO      YES     Error   Rate
NO     731.0   1434.0  0.6624  (1434.0/2165.0)
YES    295.0   2425.0  0.1085  (295.0/2720.0)
Total  1026.0  3859.0  0.7709  (0.7709/4885.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.405045186133     0.737194102447  289.0
f2                      0.135270716497     0.863416804373  386.0
f0point5                0.600421656558     0.690548294549  192.0
accuracy                0.505934494591     0.657932446264  243.0
precision               0.996011784004     1.0             0.0
absolute_MCC            0.620419824493     0.301529439421  180.0
min_per_class_accuracy  0.591067527012     0.648161764706  197.0
tns                     0.996011784004     2165.0          0.0
fns                     0.996011784004     2719.0          0.0
fps                     0.0505929587793    2165.0          399.0
tps                     0.0684239245551    2720.0          398.0
tnr                     0.996011784004     1.0             0.0
fnr                     0.996011784004     0.999632352941  0.0
fpr                     0.0505929587793    1.0             399.0
tpr                     0.0684239245551    1.0             398.0






    



Scoring History:

timestamp            duration   number_of_trees  training_MSE    training_logloss  training_AUC    training_classification_error  validation_MSE  validation_logloss  validation_AUC  validation_classification_error
2015-05-22 13:26:27  0.104 sec  1.0              0.255954985197  2.08023432225     0.662981914014  0.461290738117                 0.405418554538  2.54355159204       0.645635358647  0.412282497441
2015-05-22 13:26:27  0.155 sec  2.0              0.25468472297   1.86352558991     0.661525255155  0.423780968913                 0.370926635584  1.420635165         0.676132573699  0.414738996929
2015-05-22 13:26:27  0.233 sec  3.0              0.249598393724  1.57169101175     0.668207041164  0.432992295061                 0.338066318176  1.03915856891       0.691084601277  0.367860798362
2015-05-22 13:26:27  0.328 sec  4.0              0.244778275953  1.39647444431     0.674642688382  0.430043687689                 0.309120172949  0.890395753219      0.697113758321  0.380757420676
2015-05-22 13:26:27  0.424 sec  5.0              0.240958193613  1.29019435107     0.681163088793  0.407940914567                 0.284701124531  0.812070685409      0.696648128651  0.371136131013
2015-05-22 13:26:27  0.525 sec  6.0              0.236190410816  1.10655931453     0.688713356476  0.403375314861                 0.262138591924  0.736384904024      0.6989946169    0.367656090072
2015-05-22 13:26:27  0.632 sec  7.0              0.233016124122  0.970814521363    0.693609936492  0.394177426481                 0.244223116357  0.689816810281      0.702110956392  0.368884339816
2015-05-22 13:26:27  0.745 sec  8.0              0.230513359481  0.901192589552    0.698361749513  0.385131547188                 0.230494394698  0.657428244182      0.705417827062  0.362128966223
2015-05-22 13:26:27  0.866 sec  9.0              0.22875793945   0.84306688264     0.701440488461  0.382718409482                 0.221628858232  0.636504280496      0.705721114658  0.372569089048
2015-05-22 13:26:27  0.989 sec  10.0             0.227050947395  0.78142590824     0.704638202454  0.385159849888                 0.216562495458  0.623880060288      0.708433551827  0.353940634596






    



Variable Importances:

variable       relative_importance  scaled_importance  percentage
Origin         5626.28417969        1.0                0.327272142551
fDayofMonth    4230.62353516        0.751939184023     0.246088747823
Dest           3676.58178711        0.653465354698     0.213861006715
UniqueCarrier  1316.92211914        0.234066050893     0.0766032979742
fDayOfWeek     1163.80688477        0.206851777763     0.0676968244989
Distance       1089.97692871        0.193729448051     0.063402251539
fMonth         87.2591629028        0.0155091993429    0.00507572889822
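
Comparing the two training confusion matrices shows what `balance_classes=True` changed: the row totals go from 8805 NO / 10528 YES without balancing to 10521 NO / 10530 YES with it, so the minority class was brought up to roughly the size of the majority class. The notebook does not show how H2O does this. The sketch below uses plain random oversampling with replacement, a common balancing technique chosen here as an assumption, and the helper name `oversample` is hypothetical.

```python
import random
from collections import Counter


def oversample(labels, seed=None):
    """Return row indices in which every class appears as often as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        # Draw extra rows with replacement until the class reaches the target size.
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Class counts taken from the training confusion matrix without balancing.
labels = ["NO"] * 8805 + ["YES"] * 10528
balanced = oversample(labels, seed=12)
print(Counter(labels[i] for i in balanced))
```

This sketch equalises the classes exactly, whereas the totals reported above (10521 and 10530) are only approximately equal, so H2O's procedure evidently differs in detail.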



In [8]:

    
air_test = h2o.import_frame(path=h2o.locate("smalldata/airlines/AirlinesTest.csv.zip"))









    



Parse Progress: [##################################################] 100%
Imported  /Users/ece/0xdata/h2o-dev/smalldata/airlines/AirlinesTest.csv.zip . Parsed 2,691 rows and 12 cols



In [9]:

    
def model(model_object, test):
    """Score model_object on the test frame and print its key metrics."""
    # Predict on the test frame and display the first rows
    pred = model_object.predict(test)
    pred.head()
    # Compute test-set performance, including the confusion matrix
    perf = model_object.model_performance(test)
    perf.show()
    print(perf.confusion_matrix())
    print(perf.precision())
    print(perf.accuracy())
    print(perf.auc())



In [10]:

    
print("\n\nWITHOUT CLASS BALANCING\n")
model(rf_no_bal, air_test)









    



WITHOUT CLASS BALANCING

First 10 rows and first 3 columns:

Row ID  predict  NO                   YES
1       YES      0.2999211110174656   0.7000788889825345
2       YES      0.3735275126993656   0.6264724873006344
3       YES      0.22238414585590363  0.7776158541440964
4       YES      0.3962472975254059   0.6037527024745941
5       YES      0.6098413661122322   0.39015863388776784
6       YES      0.4950307622551918   0.5049692377448082
7       NO       0.6746981769800187   0.32530182301998134
8       YES      0.48598509430885317  0.5140149056911468
9       NO       0.6735334724187851   0.32646652758121486
10      NO       0.7184682190418243   0.28153178095817566






    



ModelMetricsBinomial: drf
** Reported on test data. **

MSE: 0.209662417539
R^2: 0.153945177679
LogLoss: 0.618837191629
AUC: 0.731046158615
Gini: 0.462092317229

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.403470018009:

       NO     YES     Error   Rate
NO     408.0  809.0   0.6647  (809.0/1217.0)
YES    129.0  1345.0  0.0875  (129.0/1474.0)
Total  537.0  2154.0  0.7522  (0.7522/2691.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.403470018009     0.741455347299  293.0
f2                      0.131483560801     0.858774178513  397.0
f0point5                0.577017590124     0.706999149901  203.0
accuracy                0.545709063964     0.678558156819  219.0
precision               0.949203286087     0.970588235294  13.0
absolute_MCC            0.545709063964     0.348789541022  219.0
min_per_class_accuracy  0.579830584209     0.672998643148  201.0
tns                     1.0                1216.0          0.0
fns                     1.0                1474.0          0.0
fps                     0.119230582317     1217.0          399.0
tps                     0.131483560801     1474.0          397.0
tnr                     1.0                0.999178307313  0.0
fnr                     1.0                1.0             0.0
fpr                     0.119230582317     1.0             399.0
tpr                     0.131483560801     1.0             397.0






    



Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.403470018009:

       NO     YES     Error   Rate
NO     408.0  809.0   0.6647  (809.0/1217.0)
YES    129.0  1345.0  0.0875  (129.0/1474.0)
Total  537.0  2154.0  0.7522  (0.7522/2691.0)






    



[[0.9492032860871406, 0.9705882352941176]]
[[0.5457090639642307, 0.6785581568190264]]
0.731046158615
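
The headline numbers above can be re-derived from the four cells of the test confusion matrix. One detail worth noting: the "Error" value in the Total row (0.7522) equals the sum of the two per-class error rates (0.6647 + 0.0875), not the overall misclassification rate, which works out to 938/2691. The sketch below recomputes F1, the per-class errors, the overall error, and the Gini coefficient as 2 * AUC - 1. The helper name `binomial_metrics` is hypothetical; the formulas are the standard definitions, not taken from H2O's source.

```python
def binomial_metrics(tn, fp, fn, tp):
    """Standard metrics from a 2x2 confusion matrix with YES as the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "f1": 2 * precision * recall / (precision + recall),
        "no_error": fp / (tn + fp),    # actual NO rows predicted YES
        "yes_error": fn / (fn + tp),   # actual YES rows predicted NO
        "error": (fp + fn) / total,    # overall misclassification rate
    }


# Cells copied from the test confusion matrix of the model without class balancing.
metrics = binomial_metrics(tn=408, fp=809, fn=129, tp=1345)
gini = 2 * 0.731046158615 - 1  # AUC copied from the output above

for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
print(f"gini: {gini:.6f}")
```

The computed F1 and Gini agree with the reported 0.741455347299 and 0.462092317229.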



In [11]:

    
print("\n\nWITH CLASS BALANCING\n")
model(rf_bal, air_test)









    



WITH CLASS BALANCING

First 10 rows and first 3 columns:

Row ID  predict  NO                   YES
1       YES      0.25423263730284795  0.7457673626971522
2       YES      0.3061814057479045   0.6938185942520956
3       YES      0.29582113197078996  0.7041788680292099
4       YES      0.24460687396796132  0.7553931260320388
5       YES      0.5550336349109918   0.44496636508900816
6       YES      0.5633564660627113   0.4366435339372887
7       NO       0.6514019680463551   0.3485980319536449
8       YES      0.41344391039693884  0.5865560896030612
9       NO       0.7010735205237005   0.2989264794762995
10      NO       0.5986760058318191   0.40132399416818093






    



ModelMetricsBinomial: drf
** Reported on test data. **

MSE: 0.215031073082
R^2: 0.13228093778
LogLoss: 0.623177900509
AUC: 0.716619152687
Gini: 0.433238305373

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.431652753871:

       NO     YES     Error   Rate
NO     456.0  761.0   0.6253  (761.0/1217.0)
YES    172.0  1302.0  0.1167  (172.0/1474.0)
Total  628.0  2063.0  0.742   (0.742/2691.0)






    



Maximum Metrics:

metric                  threshold          value           idx
f1                      0.431652753871     0.736217133164  279.0
f2                      0.168528514213     0.859950859951  380.0
f0point5                0.604570530568     0.697193500739  189.0
accuracy                0.566351616261     0.669267930137  209.0
precision               0.997537422127     1.0             0.0
absolute_MCC            0.566351616261     0.330678623344  209.0
min_per_class_accuracy  0.593757553889     0.661462612983  195.0
tns                     0.997537422127     1217.0          0.0
fns                     0.997537422127     1473.0          0.0
fps                     0.0562118133373    1217.0          399.0
tps                     0.115491940237     1474.0          394.0
tnr                     0.997537422127     1.0             0.0
fnr                     0.997537422127     0.999321573948  0.0
fpr                     0.0562118133373    1.0             399.0
tpr                     0.115491940237     1.0             394.0






    



Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.431652753871:

       NO     YES     Error   Rate
NO     456.0  761.0   0.6253  (761.0/1217.0)
YES    172.0  1302.0  0.1167  (172.0/1474.0)
Total  628.0  2063.0  0.742   (0.742/2691.0)






    



[[0.997537422127463, 1.0]]
[[0.5663516162614709, 0.6692679301374953]]
0.716619152687
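
In both prediction tables some rows are labelled YES even though their YES probability is below 0.5 (row 5, for example). The labels shown are consistent with thresholding the YES probability at each model's max-F1 threshold on the validation data (0.351204755018 without balancing, 0.405045186133 with balancing). That reading is an inference from the printed output, not something the notebook states. The sketch below checks it against the ten rows displayed for each model; the helper name `label_rows` is hypothetical.

```python
def label_rows(yes_probabilities, threshold):
    """Label a row YES when its YES probability reaches the threshold."""
    return ["YES" if p >= threshold else "NO" for p in yes_probabilities]


# YES probabilities copied from the first ten test rows of each model.
no_bal_yes = [0.7000788889825345, 0.6264724873006344, 0.7776158541440964,
              0.6037527024745941, 0.39015863388776784, 0.5049692377448082,
              0.32530182301998134, 0.5140149056911468, 0.32646652758121486,
              0.28153178095817566]
bal_yes = [0.7457673626971522, 0.6938185942520956, 0.7041788680292099,
           0.7553931260320388, 0.44496636508900816, 0.4366435339372887,
           0.3485980319536449, 0.5865560896030612, 0.2989264794762995,
           0.40132399416818093]
# The predict column happens to be identical for both models on these rows.
shown = ["YES", "YES", "YES", "YES", "YES", "YES", "NO", "YES", "NO", "NO"]

# Validation max-F1 thresholds reproduce the displayed labels ...
print(label_rows(no_bal_yes, 0.351204755018) == shown)
print(label_rows(bal_yes, 0.405045186133) == shown)
# ... while a plain 0.5 cut-off does not.
print(label_rows(bal_yes, 0.5) == shown)
```

When comparing the two models, it is therefore worth keeping in mind that their predicted labels depend on different cut-offs, while AUC does not depend on any single threshold.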
