Title: Gaussian Naive Bayes Classifier
Slug: gaussian_naive_bayes_classifier
Summary: How to train a Gaussian naive Bayes classifier in Scikit-Learn
Date: 2017-09-22 12:00
Category: Machine Learning
Tags: Naive Bayes
Authors: Chris Albon
Because Gaussian naive Bayes assumes that each feature is normally distributed within each class, it is best used in cases where all of our features are continuous.
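To make the normality assumption concrete, here is a minimal sketch of the scoring rule in plain Python: each class is scored by its log prior plus the sum of per-feature Gaussian log densities, and the highest score wins. The helper names (`gaussian_log_pdf`, `class_log_score`) and the two-class parameters are made up for illustration; this is not scikit-learn's implementation.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def class_log_score(observation, prior, means, variances):
    """Unnormalized log posterior: log prior plus per-feature log likelihoods."""
    log_likelihood = sum(
        gaussian_log_pdf(x, m, v)
        for x, m, v in zip(observation, means, variances)
    )
    return math.log(prior) + log_likelihood

# Hypothetical per-class parameters for two features (illustration only)
params = {
    "a": {"prior": 0.5, "means": [1.0, 2.0], "variances": [1.0, 1.0]},
    "b": {"prior": 0.5, "means": [4.0, 5.0], "variances": [1.0, 1.0]},
}

observation = [1.2, 2.1]

# Score every class and pick the one with the highest log score
scores = {label: class_log_score(observation, **p) for label, p in params.items()}
predicted = max(scores, key=scores.get)
print(predicted)
```

Summing the per-feature log densities is the "naive" part: the features are treated as independent given the class.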
In [1]:
    
# Load libraries
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB
    
In [2]:
    
# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target
    
In [3]:
    
# Create Gaussian Naive Bayes object with prior probabilities of each class
clf = GaussianNB(priors=[0.25, 0.25, 0.5])
# Train model
model = clf.fit(X, y)
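
As a quick check on what the `priors` argument does, the sketch below compares a model fitted with the default settings against one fitted with the priors used above. It assumes the same iris data; the variable names `default_model` and `custom_model` are ours.

```python
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

# Load the same iris data used above
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Without priors, class priors are estimated from the class frequencies in y
default_model = GaussianNB().fit(X, y)

# With priors, the supplied values are used instead (they must sum to 1)
custom_model = GaussianNB(priors=[0.25, 0.25, 0.5]).fit(X, y)

# View the priors each fitted model ended up with
print(default_model.class_prior_)
print(custom_model.class_prior_)
```

Since iris has 50 observations of each class, the default priors are equal across the three classes, while the second model keeps the values we passed in.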
    
In [4]:
    
# Create new observation
new_observation = [[ 4,  4,  4,  0.4]]
    
In [5]:
    
# Predict class
model.predict(new_observation)
    
    Out[5]:
Note: The raw predicted probabilities from Gaussian naive Bayes (output by predict_proba) are not calibrated; that is, they should not be taken at face value. If we want useful predicted probabilities, we need to calibrate them using isotonic regression or a related method.
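
One way to do that calibration, sketched under our own assumptions, is scikit-learn's `CalibratedClassifierCV` with `method="isotonic"`. The 5-fold setting and the variable names are our choices, not something the original walkthrough prescribes.

```python
from sklearn import datasets
from sklearn.calibration import CalibratedClassifierCV
from sklearn.naive_bayes import GaussianNB

# Load data and reuse the observation from above
iris = datasets.load_iris()
X, y = iris.data, iris.target
new_observation = [[4, 4, 4, 0.4]]

# Raw, uncalibrated probabilities from Gaussian naive Bayes
raw_model = GaussianNB().fit(X, y)
raw_probs = raw_model.predict_proba(new_observation)

# Wrap the classifier so its probabilities are calibrated with isotonic
# regression, using 5-fold cross-validation (cv=5 is an assumption)
calibrated_model = CalibratedClassifierCV(GaussianNB(), cv=5, method="isotonic")
calibrated_model.fit(X, y)
calibrated_probs = calibrated_model.predict_proba(new_observation)

# Compare the two sets of class probabilities
print(raw_probs)
print(calibrated_probs)
```

On a dataset as small as iris, isotonic regression can overfit, so `method="sigmoid"` may be worth trying as the "related method".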