Title: Expand Cells Containing Lists Into Their Own Variables In Pandas
Slug: pandas_expand_cells_containing_lists
Summary: Expand Cells Containing Lists Into Their Own Variables In Pandas
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon


In [1]:
# import pandas
import pandas as pd

In [2]:
# create a dataset
raw_data = {'score': [1, 2, 3],
            'tags': [['apple', 'pear', 'guava'],
                     ['truck', 'car', 'plane'],
                     ['cat', 'dog', 'mouse']]}
df = pd.DataFrame(raw_data, columns=['score', 'tags'])

# view the dataset
df


Out[2]:
   score                  tags
0      1  [apple, pear, guava]
1      2   [truck, car, plane]
2      3     [cat, dog, mouse]

In [3]:
# expand df.tags into its own dataframe
tags = df['tags'].apply(pd.Series)

# rename each variable in tags
tags = tags.rename(columns = lambda x : 'tag_' + str(x))

# view the tags dataframe
tags


Out[3]:
   tag_0 tag_1  tag_2
0  apple  pear  guava
1  truck   car  plane
2    cat   dog  mouse
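
The same wide layout can also be built without calling pd.Series once per row. The sketch below is an alternative, not the method used in this post: it hands the list of lists straight to pd.DataFrame and uses add_prefix for the renaming. The name tags_alt is made up for illustration, and the dataset is rebuilt so the block runs on its own.

```python
import pandas as pd

# same dataset as above, rebuilt so this block is self-contained
df = pd.DataFrame({
    'score': [1, 2, 3],
    'tags': [['apple', 'pear', 'guava'],
             ['truck', 'car', 'plane'],
             ['cat', 'dog', 'mouse']],
})

# build the wide dataframe directly from the list of lists,
# reusing df's index so the rows stay aligned with df
tags_alt = pd.DataFrame(df['tags'].tolist(), index=df.index)

# prefix each positional column label: 0 becomes tag_0, 1 becomes tag_1, ...
tags_alt = tags_alt.add_prefix('tag_')

# view the result
print(tags_alt)
```

On larger dataframes this tends to be faster than apply(pd.Series), because it constructs a single dataframe instead of one Series per row.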

In [4]:
# join the tags dataframe back to the original dataframe
pd.concat([df, tags], axis=1)


Out[4]:
   score                  tags  tag_0 tag_1  tag_2
0      1  [apple, pear, guava]  apple  pear  guava
1      2   [truck, car, plane]  truck   car  plane
2      3     [cat, dog, mouse]    cat   dog  mouse
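
The joined dataframe above still carries the original tags column of lists. If only the expanded variables are wanted, one option is to drop that column before joining. This is a minimal sketch of that variation, assuming the same data as above; the name expanded is made up for illustration.

```python
import pandas as pd

# same dataset as above, rebuilt so this block is self-contained
df = pd.DataFrame({
    'score': [1, 2, 3],
    'tags': [['apple', 'pear', 'guava'],
             ['truck', 'car', 'plane'],
             ['cat', 'dog', 'mouse']],
})

# expand df.tags into its own dataframe and rename the columns
tags = df['tags'].apply(pd.Series)
tags = tags.rename(columns=lambda x: 'tag_' + str(x))

# drop the list column, then join the expanded columns on the shared index
expanded = df.drop(columns='tags').join(tags)

# view the result
print(expanded)
```

Because tags keeps df's index, join lines the rows up the same way pd.concat with axis=1 does.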