Title: Converting A Dictionary Into A Matrix
Slug: converting_a_dictionary_into_a_matrix
Summary: How to convert a dictionary into a feature matrix for machine learning in Python.
Date: 2016-09-06 12:00
Category: Machine Learning
Tags: Preprocessing Structured Data
Authors: Chris Albon

Preliminaries


In [3]:
# Load library
from sklearn.feature_extraction import DictVectorizer

Create Dictionary


In [4]:
# Our data: a list of dictionaries, one per observation
data_dict = [{'Red': 2, 'Blue': 4},
             {'Red': 4, 'Blue': 3},
             {'Red': 1, 'Yellow': 2},
             {'Red': 2, 'Yellow': 2}]

Feature Matrix From Dictionary


In [5]:
# Create DictVectorizer object (sparse=False returns a dense NumPy array)
dictvectorizer = DictVectorizer(sparse=False)

# Convert dictionary into feature matrix
features = dictvectorizer.fit_transform(data_dict)

# View feature matrix
features


Out[5]:
array([[ 4.,  2.,  0.],
       [ 3.,  4.,  0.],
       [ 0.,  1.,  2.],
       [ 0.,  2.,  2.]])
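
The output above has one column per key, in alphabetical order, with 0 wherever an observation lacks that key. As a rough illustration of that behavior, here is a minimal pure-Python sketch. The helper name dicts_to_matrix is hypothetical, and this is not scikit-learn's actual implementation, only an approximation of the result for numeric values.

```python
# Illustrative sketch only; dicts_to_matrix is a hypothetical helper,
# not part of scikit-learn and not how DictVectorizer is implemented.
def dicts_to_matrix(rows):
    """Return (feature_names, matrix) for a list of {feature: value} dicts."""
    # Collect every key seen in any row, sorted alphabetically
    feature_names = sorted({key for row in rows for key in row})
    # Build one row per dictionary, using 0.0 where a key is missing
    matrix = [[float(row.get(name, 0)) for name in feature_names]
              for row in rows]
    return feature_names, matrix


data_dict = [{'Red': 2, 'Blue': 4},
             {'Red': 4, 'Blue': 3},
             {'Red': 1, 'Yellow': 2},
             {'Red': 2, 'Yellow': 2}]

feature_names, matrix = dicts_to_matrix(data_dict)

print(feature_names)
for row in matrix:
    print(row)
```

The sketch only handles numeric values; DictVectorizer also handles string values by one-hot encoding them, which is not reproduced here.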

View Column Names


In [6]:
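# View feature matrix column names
# (get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out())
dictvectorizer.get_feature_names_out()


Out[6]:
array(['Blue', 'Red', 'Yellow'], dtype=object)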
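
Once fitted, the same vectorizer can be applied to new observations, and the column names can be used to label the result. The sketch below assumes a hypothetical new observation containing a key ('Green') that was not present during fitting; such keys are silently ignored by transform. It reads the names from the fitted feature_names_ attribute.

```python
# Load library
from sklearn.feature_extraction import DictVectorizer

# Same training data as above
data_dict = [{'Red': 2, 'Blue': 4},
             {'Red': 4, 'Blue': 3},
             {'Red': 1, 'Yellow': 2},
             {'Red': 2, 'Yellow': 2}]

# Fit the vectorizer so it learns the columns Blue, Red, Yellow
dictvectorizer = DictVectorizer(sparse=False)
dictvectorizer.fit(data_dict)

# Hypothetical new observation; 'Green' was never seen during fitting
new_data = [{'Red': 3, 'Green': 5}]

# Convert using the columns learned during fit; 'Green' is dropped
new_features = dictvectorizer.transform(new_data)

# Pair each column name with its value in the first (only) row
labeled_row = dict(zip(dictvectorizer.feature_names_,
                       new_features[0].tolist()))
print(labeled_row)
```

Because unseen keys are dropped without warning, it can be worth comparing the keys of new data against the fitted column names before transforming.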