Title: Selecting Pandas DataFrame Rows Based On Conditions
Slug: pandas_selecting_rows_on_conditions
Summary: Selecting Pandas DataFrame Rows Based On Conditions
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon
In [38]:
# Import modules
import pandas as pd
import numpy as np
In [39]:
# Create a dataframe
raw_data = {'first_name': ['Jason', 'Molly', np.nan, np.nan, np.nan],
            'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
            'age': [42, 52, 36, 24, 70]}
df = pd.DataFrame(raw_data, columns=['first_name', 'nationality', 'age'])
df
Out[39]:
In [40]:
# Create a boolean Series that is True where nationality is USA
american = df['nationality'] == "USA"
# Create a boolean Series that is True where age is greater than 50
elderly = df['age'] > 50
# Select all cases where nationality is USA and age is greater than 50
df[american & elderly]
Out[40]:
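The boolean Series built above can also be combined with `|` (or) and `~` (not), not only `&`. The following is a minimal sketch that rebuilds the same dataframe so it runs on its own; the names `both`, `either`, and `not_american` are illustrative and do not appear in the original cells.

```python
import numpy as np
import pandas as pd

# Same data as the dataframe created earlier
df = pd.DataFrame({
    'first_name': ['Jason', 'Molly', np.nan, np.nan, np.nan],
    'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
    'age': [42, 52, 36, 24, 70],
})

# One boolean Series per condition
american = df['nationality'] == 'USA'
elderly = df['age'] > 50

# Element-wise AND: rows where both conditions hold
both = df[american & elderly]

# Element-wise OR: rows where at least one condition holds
either = df[american | elderly]

# Element-wise NOT: rows where nationality is not USA
not_american = df[~american]

print(both)
print(either)
print(not_american)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, which is why the operators `&`, `|`, and `~` are used here.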
In [41]:
# Select all cases where the first name is not missing and nationality is USA
df[df['first_name'].notnull() & (df['nationality'] == "USA")]
Out[41]:
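When conditions are written inline, as in the last cell, each comparison needs its own parentheses because `&` binds more tightly than `==`. The sketch below shows the same selection through `.loc`, which also allows picking columns at the same time. It assumes the same dataframe as above; `mask`, `named_americans`, and `names_and_ages` are illustrative names only.

```python
import numpy as np
import pandas as pd

# Same data as the dataframe created earlier
df = pd.DataFrame({
    'first_name': ['Jason', 'Molly', np.nan, np.nan, np.nan],
    'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
    'age': [42, 52, 36, 24, 70],
})

# notna() is an alias of notnull(); the comparison is wrapped in parentheses
mask = df['first_name'].notna() & (df['nationality'] == 'USA')

# Select the matching rows with all columns
named_americans = df.loc[mask]

# Select the matching rows and only some of the columns
names_and_ages = df.loc[mask, ['first_name', 'age']]

print(named_americans)
print(names_and_ages)
```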