Comparison of the GBM and Adabag boosting configuration

Within PA, the R package adabag is currently used for boosted trees. This document provides the results of an analysis of the adabag and GBM R packages, comparing their configuration parameters. The results show...

What is boosting?

  • Goal: Improve the accuracy of the ensemble by combining classifiers that are as precise and as diverse as possible. In boosting, the base classifier of each boosting iteration depends on the previous classifiers through an updating process: with adabag (AdaBoost) the observation weights are updated, while with gbm the gradients are updated.

With each boosting iteration (step), the classifier is trained on the training data, but its learning focus is shifted to different examples of the set through adaptive weighting (as described in the adabag paper). When all iterations are complete, the classifiers produced in each iteration are combined into a final classifier. The original AdaBoost can only be applied to binary classification problems; however, AdaBoost.M1 and SAMME can be applied to multi-class classification problems.

Margins and boosting

The concept of the margin is important. The margin of an observation is related to the certainty of its classification: it is the difference between the support for the correct class and the maximum support for an incorrect class.

All wrongly classified examples will have negative margins, and correctly classified ones positive margins. A correctly classified observation with a high degree of confidence will have a margin close to one. On the other hand, examples with an uncertain classification will have small margins, that is, margins close to zero. Since a small margin is a symptom of instability in the assigned class, the same example could be assigned to different classes by similar classifiers.

A paper on margins that I must read is: Kuncheva LI (2004), Combining Pattern Classifiers: Methods and Algorithms, John Wiley & Sons. It creates a visualisation from the margin distribution by generating a cumulative distribution of margins.
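
As a minimal sketch of this idea, the code below computes margins with adabag's margins function and plots their cumulative distribution. It assumes the iris.adaboost ensemble and train index built in the Examples section further down.

iris.margins <- margins(iris.adaboost, iris[train, ])

# misclassified observations are exactly those with negative margins
sum(iris.margins$margins < 0)

# cumulative distribution of margins, as described by Kuncheva
m <- sort(iris.margins$margins)
plot(m, seq_along(m) / length(m), type = "l",
     xlab = "margin", ylab = "cumulative proportion")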

The Packages

Adabag

Provides both boosting and bagging capabilities; boosting is available only for classification tasks. It implements the AdaBoost.M1 and SAMME algorithms, so it is not restricted to dichotomous (two-class) problems. Classification trees are used as the base classifiers.

An interesting point: the weights calculated from the output of the previous classifier and applied to the next one always sum to one, and they are updated in such a way that the weight of a wrongly classified observation is increased. What does this mean? It forces the classifier in the next iteration to focus on the hardest examples.

Even better, when the classifier is reasonably accurate, the weight updates for the next classifier will be larger, as there will be fewer mistakes to focus on and these take on more importance.

In addition, the alpha constant is used in the final decision rule, giving more importance to the classifiers that made a lower error (this part is not entirely clear).
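
To make the weighting and alpha ideas concrete, here is an illustrative sketch of the AdaBoost.M1 update. This is not adabag's actual code, and the function name update_weights is made up for the example:

# w: current observation weights, y: true labels, pred: predictions of the
# classifier fitted on this iteration
update_weights <- function(w, y, pred) {
  err   <- sum(w * (pred != y)) / sum(w)  # weighted error of this classifier
  alpha <- 0.5 * log((1 - err) / err)     # Breiman's alpha; Freund drops the 1/2,
                                          # and SAMME (Zhu) adds log(nclasses - 1)
  w <- w * exp(alpha * (pred != y))       # misclassified observations gain weight
  w / sum(w)                              # renormalise so the weights sum to one
}

The same alpha values then weight each classifier's vote in the final decision rule, which is why more accurate classifiers count for more.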

Accuracy is estimated on a separate dataset, or through cross-validation.

Adabag provides boosting only for classification tasks, that is, regression is not available. With adabag, the difference between AdaBoost.M1 and SAMME looks to be the way the alpha constant (the learning rate) is calculated.

Functions Provided by the package

  • boosting - builds the boosting classifier and classifies the samples in the training set. It can use AdaBoost.M1 or SAMME. There are six arguments: formula (the usual R model formula); data (the data frame); boos (a logical parameter, TRUE by default, in which case a bootstrap sample of the training set is drawn on each iteration using the weight of each observation); mfinal (the number of iterations for which boosting is run, i.e. the number of trees to use); coeflearn, which controls the algorithm applied - Breiman and Freund apply AdaBoost.M1 with different alpha calculations, while Zhu applies SAMME; and control, which passes rpart.control options to the base trees. It outputs an object of class 'boosting'.
  • predict.boosting - predicts the class of new samples using the previously trained classifier (ensemble).
  • boosting.cv - estimates the accuracy of the classifier on a data set by cross-validation (see the sketch after this list).
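
A minimal sketch of boosting.cv, using the iris data that is loaded in the Examples section below; v sets the number of folds and the remaining arguments mirror boosting():

iris.boostcv <- boosting.cv(Species ~ ., data = iris, v = 10,
                            mfinal = 10, control = rpart.control(maxdepth = 1))

# cross-validated confusion matrix and error rate
iris.boostcv$confusion
iris.boostcv$error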

Note, the boosting object returned has an importance vector. This gives the relative importance/contribution of each variable in the classification task, so it allows quantifying the relative importance of the predictor variables.
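
For example, once the iris.adaboost ensemble from the Examples section has been built, the vector can be read directly, and adabag's importanceplot helper draws it as a bar plot:

# relative importance of each predictor in the fitted ensemble
iris.adaboost$importance
importanceplot(iris.adaboost)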

Analysis

  • margins
  • errorevol

Note, these functions enable the margin of the class prediction for each observation and the evolution of the error to be calculated. We will show this later.
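
As a small sketch of the error-evolution part (again assuming the iris.adaboost ensemble and train index from the Examples section), errorevol traces how the error changes as trees are added:

evol.train <- errorevol(iris.adaboost, newdata = iris[train, ])

# the error after 1, 2, ..., mfinal trees
plot(evol.train$error, type = "l",
     xlab = "number of trees", ylab = "training error")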

GBM

Provides a boosting framework through which multiple loss functions can be applied. This enables the creation of regression, binary classification and multi-class classification models. gbm enables the application of AdaBoost through the use of the adaboost loss function; that is, AdaBoost can be seen as a specialisation of gradient descent boosting.
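
A minimal sketch of the adaboost loss in gbm. Because gbm's "adaboost" distribution expects a 0/1 outcome, the iris data is reduced here to a two-class problem; the iris2 subset and is_virginica recoding are assumptions made purely for this illustration.

library(gbm)

# two-class version of iris for the exponential (AdaBoost) loss
iris2 <- subset(iris, Species != "setosa")
iris2$is_virginica <- as.numeric(iris2$Species == "virginica")

iris.gbm <- gbm(is_virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris2,
                distribution = "adaboost",  # exponential (AdaBoost) loss
                n.trees = 10,
                interaction.depth = 1,
                shrinkage = 1)

# relative influence of each predictor, analogous to adabag's importance vector
summary(iris.gbm)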

Examples

Let's look at the execution of boosting using the adabag package.


In [3]:
library("adabag")

#  load the data set required, in this case iris
data("iris")

# draw a stratified training index: 25 observations from each species
train <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

# create the classifier, 10 trees and depth of 1 to predict the Species 
# (categorical variable) making this multi-classification
iris.adaboost <- boosting(Species ~ ., data = iris[train, ], mfinal = 10, control = rpart.control(maxdepth = 1))

# let's view the object produced by boosting
iris.adaboost


$formula
Species ~ .

$trees
$trees[[1]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 48 virginica (0.3466667 0.2933333 0.3600000)  
  2) Petal.Length< 2.5 26  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.5 49 22 virginica (0.0000000 0.4489796 0.5510204) *

$trees[[2]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 39 versicolor (0.2533333 0.4800000 0.2666667)  
  2) Petal.Length< 2.35 19  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.35 56 20 versicolor (0.0000000 0.6428571 0.3571429) *

$trees[[3]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 41 virginica (0.25333333 0.29333333 0.45333333)  
  2) Petal.Length< 4.75 39 19 versicolor (0.48717949 0.51282051 0.00000000) *
  3) Petal.Length>=4.75 36  2 virginica (0.00000000 0.05555556 0.94444444) *

$trees[[4]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 45 setosa (0.4000000 0.2400000 0.3600000)  
  2) Petal.Length< 2.6 30  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.6 45 18 virginica (0.0000000 0.4000000 0.6000000) *

$trees[[5]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 46 virginica (0.28000000 0.33333333 0.38666667)  
  2) Petal.Width< 1.75 48 23 versicolor (0.43750000 0.52083333 0.04166667) *
  3) Petal.Width>=1.75 27  0 virginica (0.00000000 0.00000000 1.00000000) *

$trees[[6]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 42 versicolor (0.3733333 0.4400000 0.1866667)  
  2) Petal.Length< 2.45 28  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.45 47 14 versicolor (0.0000000 0.7021277 0.2978723) *

$trees[[7]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 39 virginica (0.28000000 0.24000000 0.48000000)  
  2) Petal.Width< 1.45 36 15 setosa (0.58333333 0.41666667 0.00000000) *
  3) Petal.Width>=1.45 39  3 virginica (0.00000000 0.07692308 0.92307692) *

$trees[[8]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 44 versicolor (0.3333333 0.4133333 0.2533333)  
  2) Petal.Length< 2.6 25  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.6 50 19 versicolor (0.0000000 0.6200000 0.3800000) *

$trees[[9]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 44 virginica (0.21333333 0.37333333 0.41333333)  
  2) Petal.Length< 4.8 41 16 versicolor (0.39024390 0.60975610 0.00000000) *
  3) Petal.Length>=4.8 34  3 virginica (0.00000000 0.08823529 0.91176471) *

$trees[[10]]
n= 75 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 49 versicolor (0.3200000 0.3466667 0.3333333)  
  2) Petal.Length< 2.5 24  0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.5 51 25 versicolor (0.0000000 0.5098039 0.4901961) *


$weights
 [1] 0.3465736 0.4406868 0.4327477 0.3758529 0.3192924 0.4623739 0.4029825
 [8] 0.3426498 0.4083741 0.2581778

$votes
           [,1]     [,2]      [,3]
 [1,] 2.6292974 1.160414 0.0000000
 [2,] 2.6292974 1.160414 0.0000000
 [3,] 2.6292974 1.160414 0.0000000
 [4,] 2.6292974 1.160414 0.0000000
 [5,] 2.6292974 1.160414 0.0000000
 [6,] 2.6292974 1.160414 0.0000000
 [7,] 2.6292974 1.160414 0.0000000
 [8,] 2.6292974 1.160414 0.0000000
 [9,] 2.6292974 1.160414 0.0000000
[10,] 2.6292974 1.160414 0.0000000
[11,] 2.6292974 1.160414 0.0000000
[12,] 2.6292974 1.160414 0.0000000
[13,] 2.6292974 1.160414 0.0000000
[14,] 2.6292974 1.160414 0.0000000
[15,] 2.6292974 1.160414 0.0000000
[16,] 2.6292974 1.160414 0.0000000
[17,] 2.6292974 1.160414 0.0000000
[18,] 2.6292974 1.160414 0.0000000
[19,] 2.6292974 1.160414 0.0000000
[20,] 2.6292974 1.160414 0.0000000
[21,] 2.6292974 1.160414 0.0000000
[22,] 2.6292974 1.160414 0.0000000
[23,] 2.6292974 1.160414 0.0000000
[24,] 2.6292974 1.160414 0.0000000
[25,] 2.6292974 1.160414 0.0000000
[26,] 0.0000000 2.664303 1.1254090
[27,] 0.4029825 2.664303 0.7224265
[28,] 0.4029825 2.664303 0.7224265
[29,] 0.4029825 2.664303 0.7224265
[30,] 0.4029825 2.664303 0.7224265
[31,] 0.4029825 2.664303 0.7224265
[32,] 0.0000000 1.823181 1.9665309
[33,] 0.4029825 2.664303 0.7224265
[34,] 0.4029825 2.664303 0.7224265
[35,] 0.4029825 2.664303 0.7224265
[36,] 0.4029825 2.664303 0.7224265
[37,] 0.4029825 2.664303 0.7224265
[38,] 0.4029825 2.664303 0.7224265
[39,] 0.4029825 2.664303 0.7224265
[40,] 0.0000000 2.664303 1.1254090
[41,] 0.4029825 2.664303 0.7224265
[42,] 0.0000000 2.664303 1.1254090
[43,] 0.0000000 2.664303 1.1254090
[44,] 0.4029825 2.664303 0.7224265
[45,] 0.0000000 1.823181 1.9665309
[46,] 0.4029825 2.664303 0.7224265
[47,] 0.4029825 2.664303 0.7224265
[48,] 0.4029825 2.664303 0.7224265
[49,] 0.0000000 2.664303 1.1254090
[50,] 0.0000000 2.664303 1.1254090
[51,] 0.0000000 1.503888 2.2858233
[52,] 0.0000000 1.503888 2.2858233
[53,] 0.0000000 1.823181 1.9665309
[54,] 0.0000000 1.503888 2.2858233
[55,] 0.0000000 1.503888 2.2858233
[56,] 0.0000000 1.503888 2.2858233
[57,] 0.0000000 1.503888 2.2858233
[58,] 0.0000000 1.823181 1.9665309
[59,] 0.0000000 1.503888 2.2858233
[60,] 0.0000000 1.503888 2.2858233
[61,] 0.0000000 1.503888 2.2858233
[62,] 0.0000000 1.503888 2.2858233
[63,] 0.0000000 1.503888 2.2858233
[64,] 0.0000000 1.503888 2.2858233
[65,] 0.0000000 1.503888 2.2858233
[66,] 0.0000000 1.503888 2.2858233
[67,] 0.0000000 1.503888 2.2858233
[68,] 0.0000000 1.503888 2.2858233
[69,] 0.0000000 1.823181 1.9665309
[70,] 0.0000000 1.503888 2.2858233
[71,] 0.0000000 2.664303 1.1254090
[72,] 0.0000000 1.503888 2.2858233
[73,] 0.0000000 1.503888 2.2858233
[74,] 0.0000000 1.503888 2.2858233
[75,] 0.0000000 1.503888 2.2858233

$prob
           [,1]      [,2]      [,3]
 [1,] 0.6937988 0.3062012 0.0000000
 [2,] 0.6937988 0.3062012 0.0000000
 [3,] 0.6937988 0.3062012 0.0000000
 [4,] 0.6937988 0.3062012 0.0000000
 [5,] 0.6937988 0.3062012 0.0000000
 [6,] 0.6937988 0.3062012 0.0000000
 [7,] 0.6937988 0.3062012 0.0000000
 [8,] 0.6937988 0.3062012 0.0000000
 [9,] 0.6937988 0.3062012 0.0000000
[10,] 0.6937988 0.3062012 0.0000000
[11,] 0.6937988 0.3062012 0.0000000
[12,] 0.6937988 0.3062012 0.0000000
[13,] 0.6937988 0.3062012 0.0000000
[14,] 0.6937988 0.3062012 0.0000000
[15,] 0.6937988 0.3062012 0.0000000
[16,] 0.6937988 0.3062012 0.0000000
[17,] 0.6937988 0.3062012 0.0000000
[18,] 0.6937988 0.3062012 0.0000000
[19,] 0.6937988 0.3062012 0.0000000
[20,] 0.6937988 0.3062012 0.0000000
[21,] 0.6937988 0.3062012 0.0000000
[22,] 0.6937988 0.3062012 0.0000000
[23,] 0.6937988 0.3062012 0.0000000
[24,] 0.6937988 0.3062012 0.0000000
[25,] 0.6937988 0.3062012 0.0000000
[26,] 0.0000000 0.7030357 0.2969643
[27,] 0.1063359 0.7030357 0.1906284
[28,] 0.1063359 0.7030357 0.1906284
[29,] 0.1063359 0.7030357 0.1906284
[30,] 0.1063359 0.7030357 0.1906284
[31,] 0.1063359 0.7030357 0.1906284
[32,] 0.0000000 0.4810869 0.5189131
[33,] 0.1063359 0.7030357 0.1906284
[34,] 0.1063359 0.7030357 0.1906284
[35,] 0.1063359 0.7030357 0.1906284
[36,] 0.1063359 0.7030357 0.1906284
[37,] 0.1063359 0.7030357 0.1906284
[38,] 0.1063359 0.7030357 0.1906284
[39,] 0.1063359 0.7030357 0.1906284
[40,] 0.0000000 0.7030357 0.2969643
[41,] 0.1063359 0.7030357 0.1906284
[42,] 0.0000000 0.7030357 0.2969643
[43,] 0.0000000 0.7030357 0.2969643
[44,] 0.1063359 0.7030357 0.1906284
[45,] 0.0000000 0.4810869 0.5189131
[46,] 0.1063359 0.7030357 0.1906284
[47,] 0.1063359 0.7030357 0.1906284
[48,] 0.1063359 0.7030357 0.1906284
[49,] 0.0000000 0.7030357 0.2969643
[50,] 0.0000000 0.7030357 0.2969643
[51,] 0.0000000 0.3968345 0.6031655
[52,] 0.0000000 0.3968345 0.6031655
[53,] 0.0000000 0.4810869 0.5189131
[54,] 0.0000000 0.3968345 0.6031655
[55,] 0.0000000 0.3968345 0.6031655
[56,] 0.0000000 0.3968345 0.6031655
[57,] 0.0000000 0.3968345 0.6031655
[58,] 0.0000000 0.4810869 0.5189131
[59,] 0.0000000 0.3968345 0.6031655
[60,] 0.0000000 0.3968345 0.6031655
[61,] 0.0000000 0.3968345 0.6031655
[62,] 0.0000000 0.3968345 0.6031655
[63,] 0.0000000 0.3968345 0.6031655
[64,] 0.0000000 0.3968345 0.6031655
[65,] 0.0000000 0.3968345 0.6031655
[66,] 0.0000000 0.3968345 0.6031655
[67,] 0.0000000 0.3968345 0.6031655
[68,] 0.0000000 0.3968345 0.6031655
[69,] 0.0000000 0.4810869 0.5189131
[70,] 0.0000000 0.3968345 0.6031655
[71,] 0.0000000 0.7030357 0.2969643
[72,] 0.0000000 0.3968345 0.6031655
[73,] 0.0000000 0.3968345 0.6031655
[74,] 0.0000000 0.3968345 0.6031655
[75,] 0.0000000 0.3968345 0.6031655

$class
 [1] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
 [6] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
[11] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
[16] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
[21] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
[26] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
[31] "versicolor" "virginica"  "versicolor" "versicolor" "versicolor"
[36] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
[41] "versicolor" "versicolor" "versicolor" "versicolor" "virginica" 
[46] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
[51] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
[56] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
[61] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
[66] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
[71] "versicolor" "virginica"  "virginica"  "virginica"  "virginica" 

$importance
Petal.Length  Petal.Width Sepal.Length  Sepal.Width 
    81.50077     18.49923      0.00000      0.00000 

$terms
Species ~ pesos + Sepal.Length + Sepal.Width + Petal.Length + 
    Petal.Width
attr(,"variables")
list(Species, pesos, Sepal.Length, Sepal.Width, Petal.Length, 
    Petal.Width)
attr(,"factors")
             pesos Sepal.Length Sepal.Width Petal.Length Petal.Width
Species          0            0           0            0           0
pesos            1            0           0            0           0
Sepal.Length     0            1           0            0           0
Sepal.Width      0            0           1            0           0
Petal.Length     0            0           0            1           0
Petal.Width      0            0           0            0           1
attr(,"term.labels")
[1] "pesos"        "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
attr(,"order")
[1] 1 1 1 1 1
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(Species, pesos, Sepal.Length, Sepal.Width, Petal.Length, 
    Petal.Width)
attr(,"dataClasses")
     Species        pesos Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
    "factor"    "numeric"    "numeric"    "numeric"    "numeric"    "numeric" 

$call
boosting(formula = Species ~ ., data = iris[train, ], mfinal = 10, 
    control = rpart.control(maxdepth = 1))

attr(,"vardep.summary")
    setosa versicolor  virginica 
        25         25         25 
attr(,"class")
[1] "boosting"

In [6]:
# build a confusion matrix
table(iris.adaboost$class, iris$Species[train], dnn = c("Predicted Class", "Observed Class"))


               Observed Class
Predicted Class setosa versicolor virginica
     setosa         25          0         0
     versicolor      0         23         1
     virginica       0          2        24

In [7]:
# calculate the error rate of the training sample
1 - sum(iris.adaboost$class == iris$Species[train]) / length(iris$Species[train])


0.04

Making Predictions

Making predictions with the adabag object is different to gbm. There are some similarities, but also some differences compared with gbm.

Both allow newdata to be supplied, and both take the respective ensemble model as a parameter. In each case the new data contains the values for which the predictions are required, and should contain the predictive features. Finally, there is a newmfinal option fixing the number of trees to be used. It allows the ensemble to be pruned but does not suggest a recommended number of trees; by default all the trees in the object are used.

Predictions with Adabag boosting
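
A minimal sketch, scoring the 75 iris rows that were held out of the training index built above. The newmfinal argument can prune the ensemble to fewer trees; by default all mfinal trees are used.

iris.predboosting <- predict.boosting(iris.adaboost, newdata = iris[-train, ],
                                      newmfinal = 10)

# the returned object includes the predicted class, the vote and probability
# matrices, a confusion matrix and the test error
iris.predboosting$confusion
iris.predboosting$error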

Dependencies

  • R Packages

    • gbm
    • rpart
    • caret
  • R data packages

    • mlbench

Sources

References

Alfaro, E., Gámez, M. and García, N., 2013. adabag: An R Package for Classification with Boosting and Bagging. Journal of Statistical Software, 54(2), pp. 1-35.