In [2]:
library(iRF)
Loading required package: AUC
AUC 0.3.0
Type AUCNews() to see the change log and ?AUC to get an overview.
Loading required package: doMC
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Loading required package: akima
iRF 1.0.0
In [3]:
source('R_irf_benchmarks_utils.R')
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Attaching package: ‘randomForest’
The following objects are masked from ‘package:iRF’:
classCenter, combine, getTree, grow, importance, margin, MDSplot,
na.roughfix, outlier, partialPlot, randomForest, rfcv, rfImpute,
rfNews, treesize, tuneRF, varImpPlot, varUsed
In [31]:
colnames(features) <- paste('X',0:29,sep = '')
features
X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 ⋯ X20 X21 X22 X23 X24 X25 X26 X27 X28 X29
17.990 10.38 122.80 1001.0 0.11840 0.27760 0.30010 0.14710 0.2419 0.07871 ⋯ 25.38 17.33 184.60 2019.0 0.1622 0.6656 0.71190 0.26540 0.4601 0.11890
20.570 17.77 132.90 1326.0 0.08474 0.07864 0.08690 0.07017 0.1812 0.05667 ⋯ 24.99 23.41 158.80 1956.0 0.1238 0.1866 0.24160 0.18600 0.2750 0.08902
19.690 21.25 130.00 1203.0 0.10960 0.15990 0.19740 0.12790 0.2069 0.05999 ⋯ 23.57 25.53 152.50 1709.0 0.1444 0.4245 0.45040 0.24300 0.3613 0.08758
11.420 20.38 77.58 386.1 0.14250 0.28390 0.24140 0.10520 0.2597 0.09744 ⋯ 14.91 26.50 98.87 567.7 0.2098 0.8663 0.68690 0.25750 0.6638 0.17300
20.290 14.34 135.10 1297.0 0.10030 0.13280 0.19800 0.10430 0.1809 0.05883 ⋯ 22.54 16.67 152.20 1575.0 0.1374 0.2050 0.40000 0.16250 0.2364 0.07678
12.450 15.70 82.57 477.1 0.12780 0.17000 0.15780 0.08089 0.2087 0.07613 ⋯ 15.47 23.75 103.40 741.6 0.1791 0.5249 0.53550 0.17410 0.3985 0.12440
18.250 19.98 119.60 1040.0 0.09463 0.10900 0.11270 0.07400 0.1794 0.05742 ⋯ 22.88 27.66 153.20 1606.0 0.1442 0.2576 0.37840 0.19320 0.3063 0.08368
13.710 20.83 90.20 577.9 0.11890 0.16450 0.09366 0.05985 0.2196 0.07451 ⋯ 17.06 28.14 110.60 897.0 0.1654 0.3682 0.26780 0.15560 0.3196 0.11510
13.000 21.82 87.50 519.8 0.12730 0.19320 0.18590 0.09353 0.2350 0.07389 ⋯ 15.49 30.73 106.20 739.3 0.1703 0.5401 0.53900 0.20600 0.4378 0.10720
12.460 24.04 83.97 475.9 0.11860 0.23960 0.22730 0.08543 0.2030 0.08243 ⋯ 15.09 40.68 97.65 711.4 0.1853 1.0580 1.10500 0.22100 0.4366 0.20750
16.020 23.24 102.70 797.8 0.08206 0.06669 0.03299 0.03323 0.1528 0.05697 ⋯ 19.19 33.88 123.80 1150.0 0.1181 0.1551 0.14590 0.09975 0.2948 0.08452
15.780 17.89 103.60 781.0 0.09710 0.12920 0.09954 0.06606 0.1842 0.06082 ⋯ 20.42 27.28 136.50 1299.0 0.1396 0.5609 0.39650 0.18100 0.3792 0.10480
19.170 24.80 132.40 1123.0 0.09740 0.24580 0.20650 0.11180 0.2397 0.07800 ⋯ 20.96 29.94 151.70 1332.0 0.1037 0.3903 0.36390 0.17670 0.3176 0.10230
15.850 23.95 103.70 782.7 0.08401 0.10020 0.09938 0.05364 0.1847 0.05338 ⋯ 16.84 27.66 112.00 876.5 0.1131 0.1924 0.23220 0.11190 0.2809 0.06287
13.730 22.61 93.60 578.3 0.11310 0.22930 0.21280 0.08025 0.2069 0.07682 ⋯ 15.03 32.01 108.80 697.7 0.1651 0.7725 0.69430 0.22080 0.3596 0.14310
14.540 27.54 96.73 658.8 0.11390 0.15950 0.16390 0.07364 0.2303 0.07077 ⋯ 17.46 37.13 124.10 943.2 0.1678 0.6577 0.70260 0.17120 0.4218 0.13410
14.680 20.13 94.74 684.5 0.09867 0.07200 0.07395 0.05259 0.1586 0.05922 ⋯ 19.07 30.88 123.40 1138.0 0.1464 0.1871 0.29140 0.16090 0.3029 0.08216
16.130 20.68 108.10 798.8 0.11700 0.20220 0.17220 0.10280 0.2164 0.07356 ⋯ 20.96 31.48 136.80 1315.0 0.1789 0.4233 0.47840 0.20730 0.3706 0.11420
19.810 22.15 130.00 1260.0 0.09831 0.10270 0.14790 0.09498 0.1582 0.05395 ⋯ 27.32 30.88 186.80 2398.0 0.1512 0.3150 0.53720 0.23880 0.2768 0.07615
13.540 14.36 87.46 566.3 0.09779 0.08129 0.06664 0.04781 0.1885 0.05766 ⋯ 15.11 19.26 99.70 711.2 0.1440 0.1773 0.23900 0.12880 0.2977 0.07259
13.080 15.71 85.63 520.0 0.10750 0.12700 0.04568 0.03110 0.1967 0.06811 ⋯ 14.50 20.49 96.09 630.5 0.1312 0.2776 0.18900 0.07283 0.3184 0.08183
9.504 12.44 60.34 273.9 0.10240 0.06492 0.02956 0.02076 0.1815 0.06905 ⋯ 10.23 15.66 65.13 314.9 0.1324 0.1148 0.08867 0.06227 0.2450 0.07773
15.340 14.26 102.50 704.4 0.10730 0.21350 0.20770 0.09756 0.2521 0.07032 ⋯ 18.07 19.08 125.10 980.9 0.1390 0.5954 0.63050 0.23930 0.4667 0.09946
21.160 23.04 137.20 1404.0 0.09428 0.10220 0.10970 0.08632 0.1769 0.05278 ⋯ 29.17 35.59 188.00 2615.0 0.1401 0.2600 0.31550 0.20090 0.2822 0.07526
16.650 21.38 110.00 904.6 0.11210 0.14570 0.15250 0.09170 0.1995 0.06330 ⋯ 26.46 31.56 177.00 2215.0 0.1805 0.3578 0.46950 0.20950 0.3613 0.09564
17.140 16.40 116.00 912.7 0.11860 0.22760 0.22290 0.14010 0.3040 0.07413 ⋯ 22.25 21.40 152.40 1461.0 0.1545 0.3949 0.38530 0.25500 0.4066 0.10590
14.580 21.53 97.41 644.8 0.10540 0.18680 0.14250 0.08783 0.2252 0.06924 ⋯ 17.62 33.21 122.40 896.9 0.1525 0.6643 0.55390 0.27010 0.4264 0.12750
18.610 20.25 122.10 1094.0 0.09440 0.10660 0.14900 0.07731 0.1697 0.05699 ⋯ 21.31 27.26 139.90 1403.0 0.1338 0.2117 0.34460 0.14900 0.2341 0.07421
15.300 25.27 102.40 732.4 0.10820 0.16970 0.16830 0.08751 0.1926 0.06540 ⋯ 20.27 36.71 149.30 1269.0 0.1641 0.6110 0.63350 0.20240 0.4027 0.09876
17.570 15.05 115.00 955.1 0.09847 0.11570 0.09875 0.07953 0.1739 0.06149 ⋯ 20.01 19.52 134.90 1227.0 0.1255 0.2812 0.24890 0.14560 0.2756 0.07919
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
7.691 25.44 48.34 170.4 0.08668 0.11990 0.092520 0.013640 0.2037 0.07751 ⋯ 8.678 31.89 54.49 223.6 0.15960 0.30640 0.33930 0.05000 0.2790 0.10660
11.540 14.44 74.65 402.9 0.09984 0.11200 0.067370 0.025940 0.1818 0.06782 ⋯ 12.260 19.68 78.78 457.8 0.13450 0.21180 0.17970 0.06918 0.2329 0.08134
14.470 24.99 95.81 656.4 0.08837 0.12300 0.100900 0.038900 0.1872 0.06341 ⋯ 16.220 31.73 113.50 808.9 0.13400 0.42020 0.40400 0.12050 0.3187 0.10230
14.740 25.42 94.70 668.6 0.08275 0.07214 0.041050 0.030270 0.1840 0.05680 ⋯ 16.510 32.29 107.40 826.4 0.10600 0.13760 0.16110 0.10950 0.2722 0.06956
13.210 28.06 84.88 538.4 0.08671 0.06877 0.029870 0.032750 0.1628 0.05781 ⋯ 14.370 37.17 92.48 629.6 0.10720 0.13810 0.10620 0.07958 0.2473 0.06443
13.870 20.70 89.77 584.8 0.09578 0.10180 0.036880 0.023690 0.1620 0.06688 ⋯ 15.050 24.75 99.17 688.6 0.12640 0.20370 0.13770 0.06845 0.2249 0.08492
13.620 23.23 87.19 573.2 0.09246 0.06747 0.029740 0.024430 0.1664 0.05801 ⋯ 15.350 29.09 97.58 729.8 0.12160 0.15170 0.10490 0.07174 0.2642 0.06953
10.320 16.35 65.31 324.9 0.09434 0.04994 0.010120 0.005495 0.1885 0.06201 ⋯ 11.250 21.77 71.12 384.9 0.12850 0.08842 0.04384 0.02381 0.2681 0.07399
10.260 16.58 65.85 320.8 0.08877 0.08066 0.043580 0.024380 0.1669 0.06714 ⋯ 10.830 22.04 71.08 357.4 0.14610 0.22460 0.17830 0.08333 0.2691 0.09479
9.683 19.34 61.05 285.7 0.08491 0.05030 0.023370 0.009615 0.1580 0.06235 ⋯ 10.930 25.59 69.10 364.2 0.11990 0.09546 0.09350 0.03846 0.2552 0.07920
10.820 24.21 68.89 361.6 0.08192 0.06602 0.015480 0.008160 0.1976 0.06328 ⋯ 13.030 31.45 83.90 505.6 0.12040 0.16330 0.06194 0.03264 0.3059 0.07626
10.860 21.48 68.51 360.5 0.07431 0.04227 0.000000 0.000000 0.1661 0.05948 ⋯ 11.660 24.77 74.08 412.3 0.10010 0.07348 0.00000 0.00000 0.2458 0.06592
11.130 22.44 71.49 378.4 0.09566 0.08194 0.048240 0.022570 0.2030 0.06552 ⋯ 12.020 28.26 77.80 436.6 0.10870 0.17820 0.15640 0.06413 0.3169 0.08032
12.770 29.43 81.35 507.9 0.08276 0.04234 0.019970 0.014990 0.1539 0.05637 ⋯ 13.870 36.00 88.10 594.7 0.12340 0.10640 0.08653 0.06498 0.2407 0.06484
9.333 21.94 59.01 264.0 0.09240 0.05605 0.039960 0.012820 0.1692 0.06576 ⋯ 9.845 25.05 62.86 295.8 0.11030 0.08298 0.07993 0.02564 0.2435 0.07393
12.880 28.92 82.50 514.3 0.08123 0.05824 0.061950 0.023430 0.1566 0.05708 ⋯ 13.890 35.74 88.84 595.7 0.12270 0.16200 0.24390 0.06493 0.2372 0.07242
10.290 27.61 65.67 321.4 0.09030 0.07658 0.059990 0.027380 0.1593 0.06127 ⋯ 10.840 34.91 69.57 357.6 0.13840 0.17100 0.20000 0.09127 0.2226 0.08283
10.160 19.59 64.73 311.7 0.10030 0.07504 0.005025 0.011160 0.1791 0.06331 ⋯ 10.650 22.88 67.88 347.3 0.12650 0.12000 0.01005 0.02232 0.2262 0.06742
9.423 27.88 59.26 271.3 0.08123 0.04971 0.000000 0.000000 0.1742 0.06059 ⋯ 10.490 34.24 66.50 330.6 0.10730 0.07158 0.00000 0.00000 0.2475 0.06969
14.590 22.68 96.39 657.1 0.08473 0.13300 0.102900 0.037360 0.1454 0.06147 ⋯ 15.480 27.27 105.90 733.5 0.10260 0.31710 0.36620 0.11050 0.2258 0.08004
11.510 23.93 74.52 403.5 0.09261 0.10210 0.111200 0.041050 0.1388 0.06570 ⋯ 12.480 37.16 82.28 474.2 0.12980 0.25170 0.36300 0.09653 0.2112 0.08732
14.050 27.15 91.38 600.4 0.09929 0.11260 0.044620 0.043040 0.1537 0.06171 ⋯ 15.300 33.17 100.20 706.7 0.12410 0.22640 0.13260 0.10480 0.2250 0.08321
11.200 29.37 70.67 386.0 0.07449 0.03558 0.000000 0.000000 0.1060 0.05502 ⋯ 11.920 38.30 75.19 439.6 0.09267 0.05494 0.00000 0.00000 0.1566 0.05905
15.220 30.62 103.40 716.9 0.10480 0.20870 0.255000 0.094290 0.2128 0.07152 ⋯ 17.520 42.79 128.70 915.0 0.14170 0.79170 1.17000 0.23560 0.4089 0.14090
20.920 25.09 143.00 1347.0 0.10990 0.22360 0.317400 0.147400 0.2149 0.06879 ⋯ 24.290 29.41 179.10 1819.0 0.14070 0.41860 0.65990 0.25420 0.2929 0.09873
21.560 22.39 142.00 1479.0 0.11100 0.11590 0.243900 0.138900 0.1726 0.05623 ⋯ 25.450 26.40 166.10 2027.0 0.14100 0.21130 0.41070 0.22160 0.2060 0.07115
20.130 28.25 131.20 1261.0 0.09780 0.10340 0.144000 0.097910 0.1752 0.05533 ⋯ 23.690 38.25 155.00 1731.0 0.11660 0.19220 0.32150 0.16280 0.2572 0.06637
16.600 28.08 108.30 858.1 0.08455 0.10230 0.092510 0.053020 0.1590 0.05648 ⋯ 18.980 34.12 126.70 1124.0 0.11390 0.30940 0.34030 0.14180 0.2218 0.07820
20.600 29.33 140.10 1265.0 0.11780 0.27700 0.351400 0.152000 0.2397 0.07016 ⋯ 25.740 39.42 184.60 1821.0 0.16500 0.86810 0.93870 0.26500 0.4087 0.12400
7.760 24.54 47.92 181.0 0.05263 0.04362 0.000000 0.000000 0.1587 0.05884 ⋯ 9.456 30.37 59.16 268.6 0.08996 0.06444 0.00000 0.00000 0.2871 0.07039
In [35]:
features <- read.csv('data/breast_cancer_features.csv', header = FALSE)
responses <- read.csv('data/breast_cancer_responses.csv', header = FALSE)
features <- as.matrix(features)
responses <- as.factor(responses$V1)
# data <- parse_data(features, responses, train_split_propn = 0.8, N_obs = 'all', N_features = 20, seed = 2018)
#dim(data$X_train)
#dim(data$X_test)
#length(data$y_train)
#length(data$y_test)
0:(dim(features)[2] - 1)  # parenthesized: ':' binds tighter than '-', so 0:n-1 would start at -1
- 0
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
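A note on operator precedence, since zero-based column labels are easy to get wrong in R: `:` binds more tightly than binary `-`, so `0:n-1` is evaluated as `(0:n) - 1`. The sketch below assumes a column count `n` of 30 (matching the `X0` to `X29` names used in this notebook) and compares the two spellings.

```r
n <- 30L  # assumed number of feature columns

shifted    <- 0:n - 1       # parsed as (0:n) - 1: 31 values, starting at -1
zero_based <- 0:(n - 1L)    # intended: 30 values, 0 through 29
alt        <- seq_len(n) - 1L  # equivalent, and safe even when n is 0

length(shifted)
range(zero_based)
identical(zero_based, alt)
```

`seq_len(n) - 1L` is often the more defensive choice because `0:(n - 1)` counts backwards when `n` is 0.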
In [4]:
library(yaml)
specs <- yaml.load_file('./specs/iRF_mod02.yaml')
print(specs)
spec_comb <- expand.grid(specs)
print(spec_comb)
$inp_dsname
[1] "breast_cancer"
$n_trials
[1] 2
$n_iter
[1] 1
$train_split_propn
[1] 0.8
$n_estimators
[1] 20
$n_bootstraps
[1] 10
$propn_n_samples
[1] 0.8
$bin_class_type
[1] 1
$n_RIT
[1] 20
$max_depth
[1] 5
$noisy_split
[1] FALSE
$num_splits
[1] 2
$n_estimators_bootstrap
[1] 5
$N_obs
[1] "all"
$N_features
[1] "all"
inp_dsname n_trials n_iter train_split_propn n_estimators n_bootstraps
1 breast_cancer 2 1 0.8 20 10
propn_n_samples bin_class_type n_RIT max_depth noisy_split num_splits
1 0.8 1 20 5 FALSE 2
n_estimators_bootstrap N_obs N_features
1 5 all all
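With a single value per key, `expand.grid(specs)` yields one row, as printed above. The sketch below shows what happens once a key holds several values: every combination becomes its own row, which can then drive one benchmark run each. The `specs` list here is a hand-written, hypothetical stand-in for what `yaml.load_file` returns, not the contents of `iRF_mod02.yaml`.

```r
# Hypothetical spec list; two keys carry more than one value.
specs <- list(
  inp_dsname  = "breast_cancer",
  n_trials    = c(2, 5),
  max_depth   = c(3, 5),
  noisy_split = FALSE
)

# One row per combination; keep strings as character rather than factor.
spec_comb <- expand.grid(specs, stringsAsFactors = FALSE)
nrow(spec_comb)

# Iterate over the rows, reading parameters by column name.
for (i in seq_len(nrow(spec_comb))) {
  run <- spec_comb[i, ]
  cat(sprintf("run %g: n_trials=%g, max_depth=%g\n", i, run$n_trials, run$max_depth))
}
```

The first listed key varies fastest, so the rows are ordered by `max_depth` and then `n_trials` in this example.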
In [5]:
spec_comb[1,'n_trials']
2
In [6]:
bm <- get_iRF_benchmarks(data$X_train, data$X_test, data$y_train, data$y_test, n_trials=spec_comb[1,'n_trials'],
K=spec_comb[1,'n_iter'],
n_estimators=spec_comb[1,'n_estimators'],
B=spec_comb[1,'n_bootstraps'],
M=spec_comb[1,'n_RIT'],
max_depth=spec_comb[1,'max_depth'],
noisy_split=spec_comb[1,'noisy_split'],
num_splits=spec_comb[1,'num_splits'],
seed=2018)
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
[1] "AUROC: 0.99"
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
[1] "AUROC: 0.99"
In [7]:
names(bm)
- 'metrics_all'
- 'metrics_summary'
- 'feature_importance'
- 'stability_all'
- 'ff'
In [8]:
iRF_bm <- consolidate_bm_iRF(features, responses, specs,
                             seed_data_split = NULL, seed_classifier = NULL)  # R uses NULL, not Python's None
inp_dsname n_trials n_iter train_split_propn n_estimators n_bootstraps
1 breast_cancer 2 1 0.8 20 10
propn_n_samples bin_class_type n_RIT max_depth noisy_split num_splits
1 0.8 1 20 5 FALSE 2
n_estimators_bootstrap N_obs N_features
1 5 all all
inp_dsname n_trials n_iter train_split_propn n_estimators n_bootstraps
1 breast_cancer 2 1 0.8 20 10
propn_n_samples bin_class_type n_RIT max_depth noisy_split num_splits
1 0.8 1 20 5 FALSE 2
n_estimators_bootstrap N_obs N_features
1 5 all all
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
[1] "AUROC: 0.97"
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
[1] "AUROC: 0.98"
In [12]:
iRF_bm[[1]]
$metrics_all
$metrics_all$times
[1] 5.905103 5.866463
$metrics_all$score
[1] 0.9473684 0.9561404
$metrics_summary
$metrics_summary$times
[1] 5.88578320 0.02732262
$metrics_summary$score
[1] 0.951754386 0.006202691
$feature_importance
$feature_importance[[1]]
MeanDecreaseGini
V1 9.4209773
V2 1.1349823
V3 6.9509956
V4 14.5629329
V5 0.3142857
V6 0.3600000
V7 9.6105466
V8 17.6806820
V9 0.7469826
V10 0.8977865
V11 2.0444174
V12 1.9886143
V13 5.3231042
V14 2.7276655
V15 1.3999610
V16 0.8464013
V17 0.4815344
V18 0.3573983
V19 1.1549009
V20 1.5742178
V21 21.9348837
V22 3.5119554
V23 62.5072063
V24 13.9325965
V25 2.3505814
V26 4.2003096
V27 0.9440588
V28 19.9942422
V29 1.8027630
V30 1.0777416
$feature_importance[[2]]
MeanDecreaseGini
V1 12.4480459
V2 2.3360672
V3 1.9584065
V4 21.4260223
V5 0.8622394
V6 6.3158302
V7 3.6466620
V8 29.0991918
V9 0.3668440
V10 0.2636390
V11 1.3454501
V12 1.6110228
V13 7.7898300
V14 10.8271985
V15 0.9104149
V16 0.9216757
V17 0.5357604
V18 0.0000000
V19 1.0801801
V20 1.4918053
V21 19.0172308
V22 5.7622887
V23 29.1551352
V24 19.9679582
V25 1.1491682
V26 5.0161659
V27 3.5954133
V28 22.3604443
V29 1.5617773
V30 0.1500000
$stability_all
$stability_all[[1]]
V21 V28 V24 V23 V27 V4 V8
1.0 1.0 0.9 0.8 0.8 0.8 0.8
V14 V7 V1 V3 V22 V26 V11
0.7 0.7 0.5 0.5 0.4 0.4 0.3
V13 V23_V28 V7_V24 V1_V23 V1_V28 V2 V21_V24
0.3 0.3 0.3 0.2 0.2 0.2 0.2
V4_V7 V7_V21 V8_V23 V11_V21 V11_V28 V12_V26 V1_V23_V26
0.2 0.2 0.2 0.1 0.1 0.1 0.1
V1_V27 V1_V27_V28 V14_V23 V1_V8 V19 V19_V28 V21_V23
0.1 0.1 0.1 0.1 0.1 0.1 0.1
V21_V27 V21_V28 V21_V29 V22_V28 V23_V27 V24_V26 V24_V27
0.1 0.1 0.1 0.1 0.1 0.1 0.1
V24_V28 V24_V30 V25 V27_V28 V29 V30 V3_V23
0.1 0.1 0.1 0.1 0.1 0.1 0.1
V3_V24 V3_V28 V4_V21 V4_V23 V4_V27 V4_V28 V6
0.1 0.1 0.1 0.1 0.1 0.1 0.1
V6_V8 V7_V28 V8_V11 V8_V19 V8_V21
0.1 0.1 0.1 0.1 0.1
$stability_all[[2]]
V28 V21 V24 V8 V23 V14
1.0 0.9 0.9 0.9 0.8 0.7
V27 V13 V29 V3 V4 V22
0.7 0.5 0.5 0.5 0.5 0.4
V26 V11 V21_V28 V24_V27 V4_V28 V1
0.4 0.3 0.3 0.3 0.3 0.2
V12 V14_V23 V19 V21_V24 V24_V28 V6
0.2 0.2 0.2 0.2 0.2 0.2
V7 V11_V23 V11_V28 V12_V21 V12_V23 V12_V24
0.2 0.1 0.1 0.1 0.1 0.1
V13_V21 V13_V21_V24 V13_V23 V13_V24 V14_V24 V14_V24_V27
0.1 0.1 0.1 0.1 0.1 0.1
V14_V24_V28 V14_V26 V14_V27 V14_V28 V15 V16_V24
0.1 0.1 0.1 0.1 0.1 0.1
V17 V19_V21 V19_V28 V2 V21_V23_V28 V21_V27
0.1 0.1 0.1 0.1 0.1 0.1
V22_V24_V26 V23_V28 V23_V29 V25 V3_V23 V3_V4
0.1 0.1 0.1 0.1 0.1 0.1
V3_V8 V4_V24 V7_V13 V7_V21 V7_V22 V8_V14
0.1 0.1 0.1 0.1 0.1 0.1
V8_V26 V8_V27
0.1 0.1
$ff
$ff$rf_list
$ff$rf_list[[1]]
Call:
randomForest(x = x, y = y, xtest = xtest, ytest = ytest, ntree = nt, mtry_select_prob = mtry_select_prob, keep_subset_var = keep_subset_var, keep.forest = TRUE)
Type of random forest: classification
Number of trees: 20
No. of variables tried at each split: 5
$ff$interaction
$ff$interaction[[1]]
V28 V21 V24 V8 V23 V14
1.0 0.9 0.9 0.9 0.8 0.7
V27 V13 V29 V3 V4 V22
0.7 0.5 0.5 0.5 0.5 0.4
V26 V11 V21_V28 V24_V27 V4_V28 V1
0.4 0.3 0.3 0.3 0.3 0.2
V12 V14_V23 V19 V21_V24 V24_V28 V6
0.2 0.2 0.2 0.2 0.2 0.2
V7 V11_V23 V11_V28 V12_V21 V12_V23 V12_V24
0.2 0.1 0.1 0.1 0.1 0.1
V13_V21 V13_V21_V24 V13_V23 V13_V24 V14_V24 V14_V24_V27
0.1 0.1 0.1 0.1 0.1 0.1
V14_V24_V28 V14_V26 V14_V27 V14_V28 V15 V16_V24
0.1 0.1 0.1 0.1 0.1 0.1
V17 V19_V21 V19_V28 V2 V21_V23_V28 V21_V27
0.1 0.1 0.1 0.1 0.1 0.1
V22_V24_V26 V23_V28 V23_V29 V25 V3_V23 V3_V4
0.1 0.1 0.1 0.1 0.1 0.1
V3_V8 V4_V24 V7_V13 V7_V21 V7_V22 V8_V14
0.1 0.1 0.1 0.1 0.1 0.1
V8_V26 V8_V27
0.1 0.1
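The two numbers in each `metrics_summary` entry printed above are consistent with the mean and the sample standard deviation of the matching `metrics_all` vector. The check below recomputes them from the printed per-trial values; `summarise_trials` is a hypothetical helper name, and the actual implementation in `R_irf_benchmarks_utils.R` is not shown in this notebook.

```r
# Per-trial values copied from the printed $metrics_all above.
times <- c(5.905103, 5.866463)
score <- c(0.9473684, 0.9561404)

# Hypothetical helper: mean and sample standard deviation across trials.
summarise_trials <- function(x) c(mean = mean(x), sd = sd(x))

time_summary  <- summarise_trials(times)
score_summary <- summarise_trials(score)

print(time_summary)
print(score_summary)
```

With only two trials the standard deviation is a very rough estimate, so a larger `n_trials` would be needed before reading much into it.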
In [20]:
n.cores <- 7
registerDoMC(n.cores)
ff <- iRF(x=features, y=responses,
find_interaction=TRUE, n_iter=1, n.cores = n.cores, ntree = 20, n.bootstrap =20)
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
b = 11;
b = 12;
b = 13;
b = 14;
b = 15;
b = 16;
b = 17;
b = 18;
b = 19;
b = 20;
b = 21;
b = 22;
b = 23;
b = 24;
b = 25;
b = 26;
b = 27;
b = 28;
b = 29;
b = 30;
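The call above requests `n.bootstrap = 20`, yet the log shows 30 bootstrap iterations. One possible explanation (an assumption, not checked against the iRF source) is that the argument is spelled `n_bootstrap`, as in the wrapper call visible in a traceback elsewhere in this notebook, and that the dotted name was absorbed by `...` so a default was used instead. The sketch below reproduces that behaviour with a hypothetical stand-in function `fit_sketch`; it is not iRF's real signature.

```r
# Hypothetical stand-in: the real argument is n_bootstrap, extras go to '...'.
fit_sketch <- function(x, n_bootstrap = 30, ...) {
  n_bootstrap  # return the value that would actually be used
}

used_default <- fit_sketch(1, n.bootstrap = 20)  # misspelled: silently ignored
used_given   <- fit_sketch(1, n_bootstrap = 20)  # exact name: honoured

used_default
used_given
```

If this is what happened, passing `n_bootstrap = 20` (and likewise `n_core` rather than `n.cores`) would be the thing to try.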
In [29]:
a <- parse_data(features, responses, train_split_propn = 0.8, N_obs = 'all', N_features = 'all', seed = 2018)
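`parse_data` is defined in `R_irf_benchmarks_utils.R`, which is not shown here. As a rough idea of what a seeded 80/20 split can look like, here is a minimal sketch; `split_train_test` is a hypothetical helper run on synthetic data with the same shape as the features matrix printed earlier (569 rows, 30 columns), and it only mirrors the element names used in this notebook (`X_train`, `X_test`, `y_train`, `y_test`).

```r
# Hypothetical helper, not the notebook's parse_data().
split_train_test <- function(X, y, train_propn = 0.8, seed = 2018) {
  set.seed(seed)  # make the split reproducible
  n <- nrow(X)
  train_idx <- sample.int(n, size = floor(train_propn * n))
  list(
    X_train = X[train_idx, , drop = FALSE],
    X_test  = X[-train_idx, , drop = FALSE],
    y_train = y[train_idx],
    y_test  = y[-train_idx]
  )
}

# Synthetic stand-in data.
X <- matrix(rnorm(569 * 30), nrow = 569)
y <- factor(sample(c(0, 1), 569, replace = TRUE))

parts <- split_train_test(X, y)
dim(parts$X_train)
dim(parts$X_test)
```

Under these assumptions the test set has 114 rows, which would be consistent with the printed scores such as 0.9473684 (108/114).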
In [31]:
bm <- get_iRF_benchmarks(a$X_train, a$X_test, a$y_train, a$y_test, n_trials=1,
K=specs[1,'n_iter'],
n_estimators=20,
B=30,
M=20,
max_depth=5,
noisy_split=False,
num_splits=2,
seed=2018)
Error in specs[1, "n_iter"]: incorrect number of dimensions
Traceback:
1. get_iRF_benchmarks(a$X_train, a$X_test, a$y_train, a$y_test,
. n_trials = 1, K = specs[1, "n_iter"], n_estimators = 20,
. B = 30, M = 20, max_depth = 5, noisy_split = False, num_splits = 2,
. seed = 2018)
2. iRF(x = X_train, y = y_train, xtest = X_test, ytest = y_test,
. n_iter = K, ntree = n_estimators, n_core = 3, find_interaction = TRUE,
. class_id = 1, cutoff_nodesize_prop = 0.1, n_bootstrap = B,
. verbose = TRUE)
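The error arises because `specs` is the plain list returned by `yaml.load_file`, and a list has no rows or columns for `[1, 'n_iter']` to index. The traceback points inside `iRF` because R evaluates arguments lazily, so `K` was only evaluated once it was used. A minimal sketch with a hand-written stand-in list (not the real YAML contents):

```r
# Stand-in for the list that yaml.load_file() returns.
specs <- list(n_iter = 1, n_trials = 2)

# Two-dimensional indexing on a list fails.
err <- tryCatch(specs[1, "n_iter"], error = function(e) e)
inherits(err, "error")

# Lists are indexed by name instead...
k_from_list <- specs$n_iter
k_from_name <- specs[["n_iter"]]

# ...or converted to a data frame first, which does support [row, column].
spec_comb <- expand.grid(specs)
k_from_grid <- spec_comb[1, "n_iter"]
```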
In [30]:
bm <- get_iRF_benchmarks(a$X_train, a$X_test, a$y_train, a$y_test, n_trials=1,
K=1,
n_estimators=20,
B=30,
M=20,
max_depth=5,
noisy_split=FALSE,
num_splits=2,
seed=2018)
[1] "iteration = 1"
finding interactions ... b = 1;
b = 2;
b = 3;
b = 4;
b = 5;
b = 6;
b = 7;
b = 8;
b = 9;
b = 10;
b = 11;
b = 12;
b = 13;
b = 14;
b = 15;
b = 16;
b = 17;
b = 18;
b = 19;
b = 20;
b = 21;
b = 22;
b = 23;
b = 24;
b = 25;
b = 26;
b = 27;
b = 28;
b = 29;
b = 30;
[1] "AUROC: 0.98"
In [23]:
names(bm)
- 'metrics_all'
- 'metrics_summary'
- 'feature_importance'
- 'stability_all'
- 'ff'
In [ ]:
bm$feature_importance
In [3]:
library(yaml)
In [18]:
specs <- yaml.load_file('./specs/iRF_mod01.yaml')
In [19]:
specs$inp_dsname
'breast_cancer'
In [11]:
b <- expand.grid(specs)
In [7]:
b[1,]
inp_dsname n_trials n_iter train_split_propn n_estimators n_bootstraps propn_n_samples bin_class_type n_RIT max_depth noisy_split num_splits n_estimators_bootstrap N_obs N_features
breast_cancer 5 5 0.8 2 20 0.2 1 20 5 FALSE 2 5 all all
In [79]:
b[1,'N_obs']
Error in eval(expr, envir, enclos): object 'b' not found
Traceback:
In [36]:
b[1,2]
20
In [51]:
test = 'all'
if(test=='no'){print('hi')
}
else{print('bye')}
Error in parse(text = x, srcfile = src): <text>:4:1: unexpected 'else'
3: }
4: else
^
Traceback:
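The parse error comes from `else` starting a new line at top level: once the parser reaches the closing `}` on its own line, the `if` statement is already complete, so a following `else` has nothing to attach to. Keeping `else` on the same line as the closing brace avoids this. A minimal sketch:

```r
test <- 'all'

# 'else' must follow the closing brace on the same line at top level.
if (test == 'no') {
  print('hi')
} else {
  print('bye')
}

# if/else is also an expression, so the result can be assigned directly.
msg <- if (test == 'no') 'hi' else 'bye'
msg
```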
In [16]:
paste('hi', 'bye', sep = '')
'hibye'
In [42]:
load('./output/iRF_mod02_out.RData')
In [43]:
rf_bm
[[1]]
[[1]]$metrics_all
[[1]]$metrics_all$times
[1] 5.619249 6.059816
[[1]]$metrics_all$score
[1] 0.9473684 0.9473684
[[1]]$metrics_summary
[[1]]$metrics_summary$times
[1] 5.8395323 0.3115281
[[1]]$metrics_summary$score
[1] 0.9473684 0.0000000
[[1]]$feature_importance
[[1]]$feature_importance[[1]]
MeanDecreaseGini
V1 15.8397248
V2 1.3578702
V3 19.9585683
V4 8.4877189
V5 0.9570862
V6 0.8754886
V7 4.6984977
V8 24.2386461
V9 0.8418733
V10 0.8685278
V11 1.3405714
V12 0.4104433
V13 1.8175243
V14 8.3923767
V15 0.8155324
V16 3.5467466
V17 0.9885462
V18 0.2649471
V19 0.8623872
V20 0.9334602
V21 39.1284810
V22 4.5648913
V23 22.2339201
V24 21.1010298
V25 0.9897511
V26 3.2404513
V27 1.6735149
V28 13.9175589
V29 3.9884692
V30 1.2601204
[[1]]$feature_importance[[2]]
MeanDecreaseGini
V1 8.4115846
V2 2.4405166
V3 7.7185351
V4 12.2528821
V5 0.6795935
V6 1.8782587
V7 13.6126204
V8 27.1088403
V9 0.9508944
V10 0.6301316
V11 1.5826359
V12 0.8637404
V13 2.0123540
V14 8.7825479
V15 0.6598211
V16 0.3898140
V17 0.0000000
V18 1.2451425
V19 0.8330975
V20 0.4550929
V21 12.3703744
V22 2.7956846
V23 38.4184859
V24 23.1742178
V25 1.3927473
V26 3.7790024
V27 8.5905775
V28 28.5595492
V29 0.6160145
V30 0.7196385
[[1]]$stability_all
[[1]]$stability_all[[1]]
V21 V23 V28 V24 V8 V4 V1 V13 V27 V3
1.0 1.0 1.0 0.9 0.9 0.7 0.5 0.5 0.5 0.5
V14 V26 V19 V21_V28 V22 V7 V11 V1_V28 V2 V23_V24
0.4 0.4 0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.2
V24_V28 V25 V27_V28 V6 V1_V27 V13_V24 V14_V28 V16 V19_V24 V21_V23
0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1 0.1
V21_V24 V21_V27 V23_V28 V24_V26 V30 V3_V22 V3_V28 V4_V21 V4_V26 V8_V14
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
[[1]]$stability_all[[2]]
V21 V28 V23 V24 V1
1.0 1.0 0.9 0.8 0.6
V14 V8 V27 V29 V3
0.6 0.6 0.5 0.5 0.5
V13 V25 V26 V4 V7
0.4 0.4 0.4 0.4 0.4
V11 V14_V23 V14_V28 V21_V27 V21_V28
0.3 0.3 0.3 0.3 0.3
V22 V23_V28 V24_V28 V1_V23 V14_V21
0.3 0.3 0.3 0.2 0.2
V14_V27 V19 V19_V24 V21_V23 V23_V27
0.2 0.2 0.2 0.2 0.2
V27_V28 V30 V4_V24 V6 V11_V12
0.2 0.2 0.2 0.2 0.1
V11_V27 V11_V28 V1_V14 V12 V1_V28
0.1 0.1 0.1 0.1 0.1
V14_V21_V23 V1_V8_V14_V21 V19_V23 V2 V21_V24
0.1 0.1 0.1 0.1 0.1
V22_V28 V23_V24 V23_V25 V23_V25_V28 V23_V29
0.1 0.1 0.1 0.1 0.1
V23_V30 V24_V29 V25_V27 V25_V28 V28_V29
0.1 0.1 0.1 0.1 0.1
V3_V24 V3_V27 V4_V21 V4_V28 V4_V8
0.1 0.1 0.1 0.1 0.1
V8_V19 V8_V21 V8_V24
0.1 0.1 0.1
[[1]]$rf
Call:
randomForest(x = x, y = y, xtest = xtest, ytest = ytest, ntree = nt, mtry_select_prob = mtry_select_prob, keep_subset_var = keep_subset_var, keep.forest = TRUE)
Type of random forest: classification
Number of trees: 20
No. of variables tried at each split: 5
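Every value in the `stability_all` tables printed above is a multiple of 0.1, which fits the usual iRF definition of a stability score: the fraction of bootstrap replicates (10 in this spec) in which a feature or interaction is recovered. The sketch below computes such scores from made-up per-replicate results; the `recovered` list is purely illustrative and is not output from this notebook.

```r
# Made-up interaction sets from B = 5 bootstrap replicates.
recovered <- list(
  c("V21", "V28", "V21_V28"),
  c("V21", "V28"),
  c("V21", "V23", "V21_V28"),
  c("V28", "V23"),
  c("V21", "V28", "V23_V28")
)

all_terms <- unique(unlist(recovered))

# Stability = share of replicates in which each term appears.
stability <- vapply(
  all_terms,
  function(term) mean(vapply(recovered, function(s) term %in% s, logical(1))),
  numeric(1)
)
stability <- sort(stability, decreasing = TRUE)
stability
```

With only 10 replicates the scores move in steps of 0.1, so small differences between trials, as seen between `stability_all[[1]]` and `stability_all[[2]]`, are expected.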
Content source: Yu-Group/scikit-learn-sandbox