Title: Gaussian Naive Bayes Classifier
Slug: gaussian_naive_bayes_classifier
Summary: How to train a Gaussian naive Bayes classifier in scikit-learn
Date: 2017-09-22 12:00
Category: Machine Learning
Tags: Naive Bayes
Authors: Chris Albon
Gaussian naive Bayes assumes that, within each class, every feature follows a normal distribution, so it is best suited to cases where all of our features are continuous.
In [1]:
# Load libraries
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB
In [2]:
# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target
In [3]:
# Create Gaussian Naive Bayes object with prior probabilities of each class
clf = GaussianNB(priors=[0.25, 0.25, 0.5])
# Train model
model = clf.fit(X, y)
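To see what training produced, we can inspect the fitted attributes. The following is a minimal sketch, assuming the same iris data and priors as above; the name `manual_means` is introduced here purely for illustration. It checks that the stored priors are the ones we supplied and that the per-class feature means in `theta_` match the sample means computed by hand.

```python
import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

# Load data and fit the same model as above
iris = datasets.load_iris()
X, y = iris.data, iris.target
model = GaussianNB(priors=[0.25, 0.25, 0.5]).fit(X, y)

# The priors we supplied are stored in class_prior_
print(model.class_prior_)

# theta_ holds one row of feature means per class: (n_classes, n_features)
print(model.theta_.shape)

# Illustrative check: recompute the per-class means by hand and compare
manual_means = np.array([X[y == c].mean(axis=0) for c in model.classes_])
print(np.allclose(model.theta_, manual_means))
```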
In [4]:
# Create new observation
new_observation = [[4, 4, 4, 0.4]]
In [5]:
# Predict class
model.predict(new_observation)
Out[5]:
Note: the raw predicted probabilities from Gaussian naive Bayes (returned by predict_proba) are not calibrated, so they should not be taken at face value. If we want useful predicted probabilities, we will need to calibrate them using isotonic regression or a related method.
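One way to do this in scikit-learn is to wrap the classifier in `CalibratedClassifierCV`. The sketch below is illustrative rather than definitive: it assumes the same iris data, priors, and observation as above, and the choice of `cv=5` and the variable names `raw_probabilities` and `calibrated_probabilities` are assumptions made for this example.

```python
from sklearn import datasets
from sklearn.calibration import CalibratedClassifierCV
from sklearn.naive_bayes import GaussianNB

# Load data and define the observation to score
iris = datasets.load_iris()
X, y = iris.data, iris.target
new_observation = [[4, 4, 4, 0.4]]

# Uncalibrated probabilities straight from Gaussian naive Bayes
raw_model = GaussianNB(priors=[0.25, 0.25, 0.5]).fit(X, y)
raw_probabilities = raw_model.predict_proba(new_observation)

# Wrap the classifier so isotonic regression calibrates its probabilities.
# cv=5 is an assumed setting for this sketch, not a recommendation.
calibrated_model = CalibratedClassifierCV(
    GaussianNB(priors=[0.25, 0.25, 0.5]), method="isotonic", cv=5
)
calibrated_model.fit(X, y)
calibrated_probabilities = calibrated_model.predict_proba(new_observation)

# Compare the two sets of class probabilities for the same observation
print(raw_probabilities)
print(calibrated_probabilities)
```

Isotonic regression can overfit when there are few observations per class, so on a dataset as small as iris the sigmoid method (`method="sigmoid"`) may be worth comparing.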