Title: Imputing Missing Class Labels
Slug: imputing_missing_class_labels
Summary: How to impute missing class labels for machine learning in Python.
Date: 2016-09-06 12:00
Category: Machine Learning
Tags: Preprocessing Structured Data
Authors: Chris Albon
In [1]:
# Load libraries (SimpleImputer requires scikit-learn 0.20+)
import numpy as np
from sklearn.impute import SimpleImputer
In [4]:
# Create feature matrix; the first column is the categorical class label,
# with np.nan marking the missing labels
X = np.array([[0, 2.10, 1.45],
              [1, 1.18, 1.33],
              [0, 1.22, 1.27],
              [0, -0.21, -1.19],
              [np.nan, 0.87, 1.31],
              [np.nan, -0.67, -0.22]])
In [5]:
# Create imputer object; missing values are filled column by column
imputer = SimpleImputer(strategy='most_frequent')
# Fill missing values with the most frequent class
imputer.fit_transform(X)
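Out[5]:
array([[ 0.  ,  2.1 ,  1.45],
       [ 1.  ,  1.18,  1.33],
       [ 0.  ,  1.22,  1.27],
       [ 0.  , -0.21, -1.19],
       [ 0.  ,  0.87,  1.31],
       [ 0.  , -0.67, -0.22]])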
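To make explicit what the most-frequent strategy does to the class column, here is a minimal NumPy-only sketch. It is not scikit-learn's implementation: the helper name `fill_with_mode` is hypothetical, and it assumes the class labels sit in a single numeric column with `np.nan` marking the missing entries.

```python
import numpy as np


def fill_with_mode(column):
    """Return a copy of a 1-D array with NaNs replaced by the most frequent observed value.

    Hypothetical helper for illustration only. Ties go to the smallest value,
    because np.unique returns sorted values and np.argmax picks the first maximum.
    """
    column = np.asarray(column, dtype=float)
    observed = column[~np.isnan(column)]
    if observed.size == 0:
        raise ValueError("column has no observed values to take a mode from")
    values, counts = np.unique(observed, return_counts=True)
    mode = values[np.argmax(counts)]
    return np.where(np.isnan(column), mode, column)


# Same feature matrix as above: column 0 holds the class label
X = np.array([[0, 2.10, 1.45],
              [1, 1.18, 1.33],
              [0, 1.22, 1.27],
              [0, -0.21, -1.19],
              [np.nan, 0.87, 1.31],
              [np.nan, -0.67, -0.22]])

# Impute only the class column; the continuous features are left untouched
X_filled = X.copy()
X_filled[:, 0] = fill_with_mode(X[:, 0])
print(X_filled[:, 0])
```

Class 0 appears three times and class 1 once among the observed labels, so both missing labels become 0. Restricting the fill to the class column is a design choice in this sketch: an imputer fitted on the whole matrix would also replace any missing continuous values with their column modes, which is rarely what you want for real-valued features.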