Title: Stemming Words
Slug: stemming_words
Summary: How to stem words in unstructured text data for machine learning in Python.
Date: 2016-09-09 12:00
Category: Machine Learning
Tags: Preprocessing Text
Authors: Chris Albon
In [1]:
# Load library
from nltk.stem.porter import PorterStemmer
In [2]:
# Create word tokens
tokenized_words = ['i', 'am', 'humbled', 'by', 'this', 'traditional', 'meeting']
In [3]:
# Create stemmer
porter = PorterStemmer()
# Apply stemmer
[porter.stem(word) for word in tokenized_words]
Out[3]: