Title: Convert A Categorical Variable Into Dummy Variables
Slug: pandas_convert_categorical_to_dummies
Summary: Convert A Categorical Variable Into Dummy Variables
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon


In [1]:
# Import pandas
import pandas as pd

In [2]:
# Create a dataframe
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'sex': ['male', 'female', 'male', 'female', 'female']}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'sex'])
df


Out[2]:
  first_name last_name     sex
0      Jason    Miller    male
1      Molly  Jacobson  female
2       Tina       Ali    male
3       Jake    Milner  female
4        Amy     Cooze  female

In [3]:
# Create a set of dummy variables from the sex variable
df_sex = pd.get_dummies(df['sex'])
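
The default dtype of the columns returned by `pd.get_dummies` has changed across pandas releases (recent versions return booleans rather than numbers). The sketch below, which assumes a pandas version that supports the `dtype` argument, requests floats explicitly so the result does not depend on the installed version. It also shows `drop_first=True`, which keeps one fewer column than there are categories. The names `df_sex_float` and `df_sex_reduced` are illustrative only.

```python
import pandas as pd

# Same 'sex' values as the dataframe in this post
df = pd.DataFrame({'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
                   'sex': ['male', 'female', 'male', 'female', 'female']})

# Ask for float dummies explicitly instead of relying on the version default
df_sex_float = pd.get_dummies(df['sex'], dtype=float)

# drop_first=True drops the first category ('female'), leaving only 'male'
df_sex_reduced = pd.get_dummies(df['sex'], dtype=float, drop_first=True)

print(df_sex_float)
print(df_sex_reduced)
```

Dropping one level is a common choice when the dummies feed a linear model, since the full set of columns always sums to one per row.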

In [4]:
# Join the dummy variables to the main dataframe
df_new = pd.concat([df, df_sex], axis=1)
df_new


Out[4]:
  first_name last_name     sex  female  male
0      Jason    Miller    male     0.0   1.0
1      Molly  Jacobson  female     1.0   0.0
2       Tina       Ali    male     0.0   1.0
3       Jake    Milner  female     1.0   0.0
4        Amy     Cooze  female     1.0   0.0
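
Joining the dummy columns with `pd.concat(..., axis=1)` works here because the dummies inherit the index of the column they were built from, and pandas aligns rows on that index. The following is a minimal sketch with a made-up three-row dataframe and a non-default index (both are assumptions for illustration) showing what happens when the indexes match and when they do not.

```python
import pandas as pd

# Hypothetical dataframe with a non-default index
df = pd.DataFrame({'sex': ['male', 'female', 'male']}, index=[10, 11, 12])

# The dummies keep the index [10, 11, 12]
dummies = pd.get_dummies(df['sex'], dtype=int)

# Matching indexes: rows line up one to one
aligned = pd.concat([df, dummies], axis=1)

# Resetting the index on one side only breaks the alignment:
# the result holds the union of both indexes, padded with NaN
misaligned = pd.concat([df, dummies.reset_index(drop=True)], axis=1)

print(aligned)
print(misaligned)
```

If the dataframe is filtered or re-indexed between creating the dummies and joining them, it is worth checking that both sides still share the same index.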

In [5]:
# Alternative for joining the new columns
df_new = df.join(df_sex)
df_new


Out[5]:
  first_name last_name     sex  female  male
0      Jason    Miller    male     0.0   1.0
1      Molly  Jacobson  female     1.0   0.0
2       Tina       Ali    male     0.0   1.0
3       Jake    Milner  female     1.0   0.0
4        Amy     Cooze  female     1.0   0.0
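
The two steps (create the dummies, then join them) can also be collapsed into one call by passing the whole dataframe to `pd.get_dummies` with the `columns` argument. This is a sketch of that variant, not the approach used in the cells of this post, and it behaves differently in two ways: the original `sex` column is removed, and the new columns are prefixed with the source column name. The name `df_encoded` is illustrative.

```python
import pandas as pd

# Recreate the dataframe from this post
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'sex': ['male', 'female', 'male', 'female', 'female']}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'sex'])

# Encode only the 'sex' column; the other columns pass through unchanged.
# dtype=int is set explicitly so the output is 0/1 regardless of pandas version.
df_encoded = pd.get_dummies(df, columns=['sex'], dtype=int)

print(df_encoded)
```

When the original categorical column should stay in the result, the `concat` or `join` approach is the simpler fit.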