Title: Convert A String Categorical Variable To A Numeric Variable
Slug: convert_categorical_to_numeric
Summary: Convert A String Categorical Variable To A Numeric Variable
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon

Originally from: Data Origami.

Import modules


In [1]:
import pandas as pd

Create dataframe


In [2]:
raw_data = {'patient': [1, 1, 1, 2, 2], 
        'obs': [1, 2, 3, 1, 2], 
        'treatment': [0, 1, 0, 1, 0],
        'score': ['strong', 'weak', 'normal', 'weak', 'strong']} 
df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score'])
df


Out[2]:
   patient  obs  treatment   score
0        1    1          0  strong
1        1    2          1    weak
2        1    3          0  normal
3        2    1          1    weak
4        2    2          0  strong

Create a function that converts each value of df['score'] into a number (strong = 3, normal = 2, weak = 1)


In [3]:
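def score_to_numeric(x):
    # Map each category label to its numeric code
    if x == 'strong':
        return 3
    if x == 'normal':
        return 2
    if x == 'weak':
        return 1
    # Any other label falls through and returns None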
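
One property of the function above is worth knowing before applying it: a label it does not list matches none of the `if` tests, so the function returns None, which pandas treats as a missing value. The following is a minimal sketch of that behaviour. It repeats the function so the snippet runs on its own, and the label 'unknown' and the names `labels`, `converted`, and `n_missing` are hypothetical, chosen only for illustration.

```python
import pandas as pd

def score_to_numeric(x):
    if x == 'strong':
        return 3
    if x == 'normal':
        return 2
    if x == 'weak':
        return 1

# 'unknown' is a hypothetical label that the function does not handle
labels = pd.Series(['strong', 'unknown', 'weak'])
converted = labels.apply(score_to_numeric)

# Count how many labels could not be converted
n_missing = converted.isna().sum()
print(converted)
print(n_missing)
```

Checking the count of missing values after the conversion is one way to catch misspelled or unexpected labels early.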

Apply the function to the score variable


In [4]:
df['score_num'] = df['score'].apply(score_to_numeric)
df


Out[4]:
   patient  obs  treatment   score  score_num
0        1    1          0  strong          3
1        1    2          1    weak          1
2        1    3          0  normal          2
3        2    1          1    weak          1
4        2    2          0  strong          3
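
The same conversion can also be written with a dictionary and `Series.map`, which keeps the label-to-number pairs in one place. This is an alternative sketch, not the method used above. The name `score_codes` is hypothetical, and the dataframe is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example dataframe
df = pd.DataFrame({'patient': [1, 1, 1, 2, 2],
                   'obs': [1, 2, 3, 1, 2],
                   'treatment': [0, 1, 0, 1, 0],
                   'score': ['strong', 'weak', 'normal', 'weak', 'strong']})

# Hypothetical lookup table with the same pairs as score_to_numeric
score_codes = {'strong': 3, 'normal': 2, 'weak': 1}

# Look up each label in the dictionary; unlisted labels become NaN
df['score_num'] = df['score'].map(score_codes)
print(df)
```

With a dictionary, adding or changing a category means editing one line rather than adding another `if` branch.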