Read these carefully
In [1]:
# Run this cell to import the necessary packages and load the dataset
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
datafile = "dataset/funding.csv"
df = pd.read_csv(datafile)
df.drop('Dummy', axis=1, inplace=True)
df.head(n=5) # Show the first n rows of the dataset
Out[1]:
In [9]:
ls = df['Age'].tolist()
In [20]:
df.describe(include='all')
Out[20]:
In [24]:
# Example dataframe query showing there is no discrimination by gender.
# The string 'mean' avoids the FutureWarning newer pandas raises for np.mean here.
df.groupby('Gender', sort=True).agg({'Expenditures': ['mean']})
Out[24]:
Analyze the dataset and determine whether discrimination exists between the Hispanic and White not Hispanic groups by examining Expenditures. Feel free to use the dataframes defined in the cell below.
In [21]:
w = "White not Hispanic"
h = "Hispanic"
is_hispanic = df['Ethnicity'] == h
is_white = df['Ethnicity'] == w
df1 = df[is_hispanic | is_white] # filters by two ethnicity groups
dfh = df[is_hispanic]
dfw = df[is_white]
df1.head(5)
Out[21]:
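The filtering pattern above can be combined with a multi-statistic aggregation, so that a group mean is always read alongside its group size. The following is a minimal sketch on a synthetic toy frame: the rows and the names `toy`, `two_groups`, and `summary` are made up for illustration and are not taken from `funding.csv`. Only the column names `Ethnicity` and `Expenditures` mirror the notebook.

```python
import pandas as pd

# Synthetic toy rows (NOT from funding.csv); column names mirror the notebook.
toy = pd.DataFrame({
    "Ethnicity": ["Hispanic", "Hispanic", "White not Hispanic",
                  "White not Hispanic", "Asian"],
    "Expenditures": [1000, 3000, 2000, 6000, 500],
})

# isin() keeps the same rows as OR-ing two equality masks.
two_groups = toy[toy["Ethnicity"].isin(["Hispanic", "White not Hispanic"])]

# Several statistics at once: count gives the group size behind each mean.
summary = two_groups.groupby("Ethnicity")["Expenditures"].agg(
    ["count", "mean", "median"]
)
print(summary)
```

Replacing `toy` with `df` should give the same kind of table for the real data, assuming the column names match those used in the cells above.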
In [38]:
# Mean expenditures for every (ethnicity, age) pair; 'mean' replaces the deprecated np.mean.
df1.groupby(['Ethnicity', 'Age']).agg({'Expenditures': ['mean']})
Out[38]:
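Grouping by each individual age tends to produce many small groups that are hard to compare. One possible alternative is to bin ages into bands with `pd.cut` and group by band and ethnicity. This is a sketch under stated assumptions: the toy rows are synthetic, and the bin edges, the labels, and the `AgeBand` column name are hypothetical choices, not part of the original dataset.

```python
import pandas as pd

# Synthetic toy rows (NOT from funding.csv); column names mirror the notebook.
toy = pd.DataFrame({
    "Ethnicity": ["Hispanic", "Hispanic", "Hispanic",
                  "White not Hispanic", "White not Hispanic", "White not Hispanic"],
    "Age": [4, 15, 30, 5, 40, 70],
    "Expenditures": [1000, 4000, 40000, 1200, 41000, 52000],
})

# Hypothetical age bands; edges are right-inclusive, and include_lowest keeps age 0.
bins = [0, 12, 21, 50, 120]
labels = ["0-12", "13-21", "22-50", "51+"]
toy["AgeBand"] = pd.cut(toy["Age"], bins=bins, labels=labels, include_lowest=True)

# observed=True drops band/ethnicity combinations that have no rows.
by_band = (
    toy.groupby(["AgeBand", "Ethnicity"], observed=True)["Expenditures"]
       .agg(["count", "mean"])
)
print(by_band)
```

Comparing the two groups within each band, rather than overall, shows whether the age mix of each group affects the overall averages.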
In [ ]:
# Write your query below and set `df_answer` to the resulting dataframe
df_answer = None
print(df_answer)
After analyzing this dataset, was there discrimination in the expenditures across different ethnicities?
In [ ]:
# Write your answer below by setting discrimination to True or False
discrimination = None
If this clue changes your answer, try again below. Otherwise, if you are confident in your answer above, leave the following untouched.
In [ ]:
df_answer_clue = None
print(df_answer_clue)
In [ ]:
discrimination_clue = None