Title: One-Hot Encode Nominal Categorical Features
Slug: one-hot_encode_nominal_categorical_features
Summary: How to one-hot encode nominal categorical features for machine learning in Python.
Date: 2016-09-06 12:00
Category: Machine Learning
Tags: Preprocessing Structured Data
Authors: Chris Albon
In [2]:
    
# Load libraries
from sklearn.preprocessing import LabelBinarizer
import numpy as np
import pandas as pd
    
In [3]:
    
# Create NumPy array with one nominal feature (five observations)
x = np.array([['Texas'], 
              ['California'], 
              ['Texas'], 
              ['Delaware'], 
              ['Texas']])
    
In [4]:
    
# Create LabelBinarizer object
one_hot = LabelBinarizer()
# One-hot encode data
one_hot.fit_transform(x)
    
    Out[4]:
In [5]:
    
# View classes (each class corresponds to one column of the encoded output)
one_hot.classes_
    
    Out[5]:
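
The cells above fit the binarizer and list its classes. As a minimal sketch of how the two relate, the snippet below pairs each class name with its indicator column and then reverses the encoding with `inverse_transform`. The names `encoded`, `columns`, and `decoded` are illustrative and not part of the original notebook, and the input is flattened to one dimension here as an assumption to match the label-style input `LabelBinarizer` documents.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Same data as above
x = np.array([['Texas'],
              ['California'],
              ['Texas'],
              ['Delaware'],
              ['Texas']])

# Fit and encode; ravel() flattens the single column to 1-D (an assumption of this sketch)
one_hot = LabelBinarizer()
encoded = one_hot.fit_transform(x.ravel())

# Pair each class name with its indicator column;
# columns appear in the same order as one_hot.classes_
columns = dict(zip(one_hot.classes_, encoded.T))

# Reverse the encoding to recover the original labels
decoded = one_hot.inverse_transform(encoded)
```

Because the columns are ordered like `one_hot.classes_`, `columns['Texas']` is the indicator vector marking the Texas rows, and `decoded` holds the same labels as the first column of `x`.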
In [6]:
    
# Alternatively, create dummy (one-hot) features with pandas
pd.get_dummies(x[:,0])
    
    Out[6]:
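
scikit-learn documents `LabelBinarizer` as a tool for target labels, and provides `OneHotEncoder` for input features. The sketch below is an alternative to the approach above, not the notebook's own method: it encodes the same array with `OneHotEncoder` and checks that the layout matches `pd.get_dummies`. Setting `handle_unknown='ignore'` and the unseen category `'Ohio'` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Same data as above: one feature column, five observations
x = np.array([['Texas'],
              ['California'],
              ['Texas'],
              ['Delaware'],
              ['Texas']])

# handle_unknown='ignore' encodes categories not seen during fit as all zeros
encoder = OneHotEncoder(handle_unknown='ignore')

# fit_transform returns a sparse matrix by default; convert to a dense array
encoded = encoder.fit_transform(x).toarray()

# Learned categories for the single feature, in column order
categories = encoder.categories_[0]

# A category that was not in the training data (hypothetical example)
unseen = encoder.transform(np.array([['Ohio']])).toarray()

# pandas produces the same column layout for this data
dummies = pd.get_dummies(x[:, 0])
same_layout = np.array_equal(encoded, dummies.to_numpy().astype(float))
```

One reason to prefer a fitted encoder over `pd.get_dummies` in a modeling pipeline is that the encoder remembers the categories it saw, so new data is always mapped to the same columns.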