Before we get to grips with advanced topics, such as cluster analysis, deep learning, and ensemble models, let's turn our attention to a much simpler model that we have overlooked so far: the naive Bayes classifier.
Naive Bayes classifiers have their roots in Bayesian inference, named after the famed statistician and philosopher Thomas Bayes (1701-1761). Bayes' theorem famously describes the probability of an event based on prior knowledge of conditions that might be related to the event. We can use Bayes' theorem to build a statistical model that can not only classify data but also provide us with an estimate of how likely it is that our classification is correct. In our case, we can use Bayesian inference to flag an email as spam with high confidence, or to determine the probability that a woman has breast cancer, given a positive screening test.
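Before we get there, it helps to see the theorem itself at work. In its simplest form, Bayes' theorem reads P(A|B) = P(B|A) P(A) / P(B). Below is a minimal Python sketch applying it to the screening example; note that the prevalence, sensitivity, and false-positive figures are illustrative assumptions for this sketch, not values taken from a real study or from this book.

```python
# A minimal sketch of Bayes' theorem applied to a screening test.
# All numbers below are illustrative assumptions, not clinical figures.

def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) via Bayes' theorem."""
    # P(positive) by the law of total probability:
    # P(pos) = P(pos|disease) P(disease) + P(pos|no disease) P(no disease)
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive

# Assumed: 1% prevalence, 80% sensitivity, 9.6% false-positive rate.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"P(cancer | positive test) = {p:.3f}")  # ~0.078
```

Even with these generous test characteristics, the posterior probability comes out to only about 8 percent: because the disease is rare, most positive results are false positives. This counterintuitive outcome is exactly the kind of reasoning that Bayes' theorem makes precise.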
We have now gained enough experience with the mechanics of implementing machine learning methods that we should no longer be afraid to try to understand the theory behind them. Don't worry, we won't write a book on it (ahem), but we need some understanding of the theory to appreciate a model's inner workings. After that, I am sure you will find that Bayesian classifiers are easy to implement and computationally efficient, and that they tend to perform quite well on relatively small datasets.
Along the way, we also want to address the following questions: