Title: Filter pandas DataFrames
Slug: filter_dataframes
Summary: Select columns and filter rows of a pandas DataFrame by position and by condition
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon

Import modules


In [8]:
import pandas as pd

Create DataFrame


In [9]:
data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df


Out[9]:
            coverage   name  reports  year
Cochice           25  Jason        4  2012
Pima              94  Molly       24  2012
Santa Cruz        57   Tina       31  2013
Maricopa          62   Jake        2  2014
Yuma              70    Amy        3  2014

View Column


In [10]:
df['name']


Out[10]:
Cochice       Jason
Pima          Molly
Santa Cruz     Tina
Maricopa       Jake
Yuma            Amy
Name: name, dtype: object

View Two Columns


In [11]:
df[['name', 'reports']]


Out[11]:
             name  reports
Cochice     Jason        4
Pima        Molly       24
Santa Cruz   Tina       31
Maricopa     Jake        2
Yuma          Amy        3
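
The two column views above differ in what they return: a single label gives a Series, while a list of labels gives a DataFrame. The sketch below rebuilds the same data so it runs on its own; the variable names `one_column`, `two_columns`, and `one_column_frame` are illustrative and not part of the original notebook.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# A single label returns a one-dimensional Series
one_column = df['name']

# A list of labels returns a DataFrame, even if the list has one entry
two_columns = df[['name', 'reports']]
one_column_frame = df[['name']]

print(type(one_column).__name__)        # Series
print(type(two_columns).__name__)       # DataFrame
print(type(one_column_frame).__name__)  # DataFrame
```

Wrapping a single label in a list is a convenient way to keep the result two-dimensional when later code expects a DataFrame.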

View First Two Rows


In [12]:
df[:2]


Out[12]:
         coverage   name  reports  year
Cochice        25  Jason        4  2012
Pima           94  Molly       24  2012
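
The slice `df[:2]` selects rows by position. As a minimal sketch, the same two rows can also be taken with `iloc` or `head`, which some readers find more explicit. The data is rebuilt here so the block runs on its own, and the variable names are illustrative.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# Three ways to take the first two rows by position
first_two_slice = df[:2]       # plain slice, as in the cell above
first_two_iloc = df.iloc[:2]   # explicit position-based indexer
first_two_head = df.head(2)    # convenience method

# All three hold the same rows
print(first_two_slice.equals(first_two_iloc))  # True
print(first_two_head.equals(first_two_iloc))   # True
```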

View Rows Where Coverage Is Greater Than 50


In [13]:
df[df['coverage'] > 50]


Out[13]:
            coverage   name  reports  year
Pima              94  Molly       24  2012
Santa Cruz        57   Tina       31  2013
Maricopa          62   Jake        2  2014
Yuma              70    Amy        3  2014
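
The filter above works in two steps: the comparison produces a boolean Series with one entry per row, and indexing with that Series keeps only the rows marked `True`. The sketch below separates the steps; `mask` and `high_coverage` are illustrative names, and the data is rebuilt so the block runs on its own.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# Step 1: the comparison yields one True/False value per row
mask = df['coverage'] > 50
print(mask.tolist())  # [False, True, True, True, True]

# Step 2: indexing with the mask keeps the rows where it is True;
# df.loc[mask] selects the same rows as df[mask]
high_coverage = df.loc[mask]
print(list(high_coverage.index))  # ['Pima', 'Santa Cruz', 'Maricopa', 'Yuma']
```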

View Rows Where Coverage Is Greater Than 50 And Reports Is Less Than 4


In [14]:
df[(df['coverage'] > 50) & (df['reports'] < 4)]


Out[14]:
          coverage  name  reports  year
Maricopa        62  Jake        2  2014
Yuma            70   Amy        3  2014
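
Each condition above is wrapped in parentheses because `&` binds more tightly than comparison operators such as `>` and `<` in Python. As a minimal sketch, the same pattern extends to "or" (`|`) and "not" (`~`), and `DataFrame.query` offers a string-based alternative. The variable names are illustrative, and the data is rebuilt so the block runs on its own.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# AND: both conditions must hold (same filter as the cell above)
both = df[(df['coverage'] > 50) & (df['reports'] < 4)]

# The same filter written as a query string
same_with_query = df.query('coverage > 50 and reports < 4')

# OR: at least one condition must hold
either = df[(df['coverage'] > 50) | (df['reports'] < 4)]

# NOT: invert a condition
not_high = df[~(df['coverage'] > 50)]

print(list(both.index))      # ['Maricopa', 'Yuma']
print(list(either.index))    # ['Pima', 'Santa Cruz', 'Maricopa', 'Yuma']
print(list(not_high.index))  # ['Cochice']
```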