So far, we have looked at a number of interesting machine learning algorithms, from classic methods such as linear regression to more advanced techniques such as deep neural networks. At various points, we pointed out that every algorithm has its own strengths and weaknesses, and we took note of how to spot and overcome them.
However, wouldn't it be great if we could simply stack together a bunch of average classifiers to form a much stronger ensemble of classifiers?
In this chapter, we will do just that. Ensemble methods are techniques that combine multiple different models to solve a shared problem. Their use has become common practice in competitive machine learning, where making use of an ensemble typically improves an individual classifier's performance by a small percentage.
Among these techniques are the so-called bagging methods, where the vote of multiple classifiers is averaged to make a final decision, and boosting methods, where one classifier tries to correct the errors made by another. One of these methods is known as the random forest, which combines multiple decision trees. In addition, you might already be familiar with Adaptive Boosting (also known as AdaBoost), a powerful boosting technique and popular feature of OpenCV.
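To make the distinction concrete, the following is a minimal sketch (not taken from the book itself) that compares a single decision tree with a bagging ensemble, a random forest, and an AdaBoost ensemble using scikit-learn. The synthetic dataset and the choice of ten estimators are illustrative assumptions only:

```python
# Minimal sketch: compare a single tree against three ensemble methods
# on a synthetic dataset (assumed here purely for illustration).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              AdaBoostClassifier)

# Create a toy classification problem and split it into train/test sets.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    'single decision tree': DecisionTreeClassifier(random_state=42),
    'bagging':              BaggingClassifier(n_estimators=10, random_state=42),
    'random forest':        RandomForestClassifier(n_estimators=10, random_state=42),
    'AdaBoost':             AdaBoostClassifier(n_estimators=10, random_state=42),
}

# Fit each model and report its test accuracy; the ensembles will usually
# edge out the single tree by a small margin.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f'{name}: {model.score(X_test, y_test):.3f}')
```

The exact scores will vary with the data and random seed; the point is only to show that bagging and random forests aggregate many independently trained trees, while AdaBoost trains its trees sequentially, each one focusing on the mistakes of its predecessors.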
As we progress through the chapter, we want to find answers to the following questions:
Let's jump right in!