Title: Grouping Rows In Pandas
Slug: pandas_group_rows_by
Summary: Grouping Rows In Pandas
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon
In [5]:
    
# Import modules
import pandas as pd
    
In [6]:
    
# Example dataframe
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
        'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
        'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
        'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
        'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])
df
    
In [7]:
    
# Create a groupby object: an object that represents the grouping
# itself, before any statistic is computed. Here we group the
# pre-test scores by regiment.
regiment_preScore = df['preTestScore'].groupby(df['regiment'])
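
The grouping object holds the split of rows into groups, and it can be inspected before any aggregation is applied. Below is a minimal sketch of that inspection, using a small stand-in frame. The `toy` data and the variable names are illustrative assumptions, not part of the example dataframe above.

```python
import pandas as pd

# Hypothetical stand-in data, smaller than the example dataframe
toy = pd.DataFrame({
    'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Scouts'],
    'preTestScore': [4, 24, 3, 2],
})

# Same pattern as above: group one column by the values of another
grouped = toy['preTestScore'].groupby(toy['regiment'])

# Number of distinct groups
n_groups = grouped.ngroups

# Number of rows in each group, as a Series indexed by regiment
group_sizes = grouped.size()

# The scores that belong to a single group
nighthawks_scores = grouped.get_group('Nighthawks')

print(n_groups)
print(group_sizes)
print(nighthawks_scores.tolist())
```

Checking `ngroups` and `size()` first is a cheap way to confirm the grouping key splits the rows as intended before computing statistics on it.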
    
In [8]:
    
# Display the mean of each regiment's pre-test score
regiment_preScore.mean()
    
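
The same per-regiment mean can also be written by grouping the dataframe on a column name and then selecting the column to aggregate. The sketch below shows one way to do this. It rebuilds only the two relevant columns of the example data so that it runs on its own, and the names `scores`, `mean_by_regiment`, and `summary` are assumptions introduced for illustration.

```python
import pandas as pd

# The regiment and preTestScore columns from the example data above
scores = pd.DataFrame({
    'regiment': ['Nighthawks'] * 4 + ['Dragoons'] * 4 + ['Scouts'] * 4,
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
})

# Group by column name, then select the column to aggregate
mean_by_regiment = scores.groupby('regiment')['preTestScore'].mean()

# Several summary statistics per regiment in one call
summary = scores.groupby('regiment')['preTestScore'].agg(['mean', 'min', 'max'])

print(mean_by_regiment)
print(summary)
```

Grouping by column name avoids repeating the dataframe variable, and `agg` with a list of function names returns one column per statistic.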