Title: Multinomial Naive Bayes Classifier
Slug: multinomial_naive_bayes_classifier
Summary: How to train a multinomial naive Bayes classifier in Scikit-Learn
Date: 2017-09-22 12:00
Category: Machine Learning
Tags: Naive Bayes
Authors: Chris Albon
Multinomial naive Bayes works similarly to Gaussian naive Bayes, but the features are assumed to be multinomially distributed. In practice, this means the classifier is commonly used when we have discrete data (e.g. word counts, or movie ratings ranging from 1 to 5).
In [1]:
# Load libraries
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
In [2]:
# Create text
text_data = np.array(['I love Brazil. Brazil!',
                      'Brazil is best',
                      'Germany beats both'])
In [3]:
# Create bag of words
count = CountVectorizer()
bag_of_words = count.fit_transform(text_data)
# Create feature matrix
X = bag_of_words.toarray()
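To see which word each column of the feature matrix represents, you can ask the vectorizer for its vocabulary. A minimal sketch (get_feature_names_out is available in scikit-learn 1.0 and later; older versions expose get_feature_names instead):

# View the word behind each column of the feature matrix
# (assumes scikit-learn >= 1.0; older versions use count.get_feature_names())
count.get_feature_names_out()

The columns are ordered alphabetically, so for this corpus they correspond to 'beats', 'best', 'both', 'brazil', 'germany', 'is', and 'love'. Each value in X is a word count, exactly the kind of discrete feature the multinomial model expects.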
In [4]:
# Create target vector
y = np.array([0,0,1])
In [5]:
# Create multinomial naive Bayes object with prior probabilities of each class
clf = MultinomialNB(class_prior=[0.25, 0.5])
# Train model
model = clf.fit(X, y)
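Because class_prior was supplied, the model uses those values directly instead of estimating priors from the class frequencies in y. A minimal sketch of inspecting the fitted model, using the standard MultinomialNB attributes:

# The fitted class priors should match the supplied class_prior values
np.exp(model.class_log_prior_)
# Smoothed per-class word probabilities, stored as log probabilities
model.feature_log_prob_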
In [6]:
# Create new observation
new_observation = [[0, 0, 0, 1, 0, 1, 0]]
In [7]:
# Predict new observation's class
model.predict(new_observation)
Out[7]:
array([0])
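To see how confident the model is rather than just the predicted class, predict_proba returns the posterior probability of each class:

# View posterior probabilities of each class for the new observation
model.predict_proba(new_observation)

The new observation contains the words 'brazil' and 'is', which appear only in the class 0 training texts, so the model predicts class 0 even though class 1 was given the larger prior.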