Read these carefully
In [1]:
# Run the following to import necessary packages and import dataset
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
datafile = "dataset/funding.csv"
df = pd.read_csv(datafile)
df.drop('Dummy', axis=1, inplace=True)
df.head(n=5) # Show the first n rows of the dataset
Out[1]:
In [9]:
ls = df['Age'].tolist()
In [20]:
df.describe(include='all')
Out[20]:
In [24]:
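# Example dataframe query: mean expenditures by gender, showing there is no discrimination by gender.
# The string 'mean' is used instead of np.mean, which newer pandas versions warn about inside agg.
df.groupby(['Gender'], sort=True).agg({'Expenditures': ['mean']})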
Out[24]:
Analyze the dataset and determine whether discrimination exists between the "Hispanic" and "White not Hispanic" groups by examining Expenditures. Feel free to use the dataframes defined in the cell below.
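As a starting point, here is a minimal sketch of an overall comparison on a small made-up dataframe. The `toy` values are invented for illustration and are not taken from `funding.csv`; only the `Ethnicity` and `Expenditures` column names come from this notebook.

```python
import pandas as pd

# Hypothetical toy data; values are invented, not taken from funding.csv.
toy = pd.DataFrame({
    "Ethnicity": ["Hispanic", "Hispanic",
                  "White not Hispanic", "White not Hispanic"],
    "Expenditures": [1000, 3000, 2000, 6000],
})

# Mean and row count per ethnicity; the count shows how many rows back each mean.
overall = toy.groupby("Ethnicity")["Expenditures"].agg(["mean", "count"])
print(overall)
```

The same pattern should apply to `df1` from the cell below, assuming its columns are named as shown in this notebook.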
In [21]:
w = "White not Hispanic"
h = "Hispanic"
is_hispanic = df['Ethnicity'] == h
is_white = df['Ethnicity'] == w
df1 = df[is_hispanic | is_white] # filters by two ethnicity groups
dfh = df[is_hispanic]
dfw = df[is_white]
df1.head(5)
Out[21]:
In [38]:
# Mean expenditures for each ethnicity at each age
df1.groupby(['Ethnicity', 'Age']).agg({'Expenditures': ['mean']})
Out[38]:
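A by-age breakdown like the one above can be easier to compare when the two groups sit side by side. The sketch below does this with `pivot_table` on made-up data; the `toy` values are invented and say nothing about what `funding.csv` actually contains, and only the column names are taken from this notebook.

```python
import pandas as pd

# Hypothetical toy data; values are invented, not taken from funding.csv.
toy = pd.DataFrame({
    "Ethnicity": ["Hispanic", "Hispanic", "Hispanic",
                  "White not Hispanic", "White not Hispanic", "White not Hispanic"],
    "Age": [5, 5, 40, 5, 40, 40],
    "Expenditures": [1000, 1200, 40000, 1100, 40000, 42000],
})

# Rows are ages, columns are ethnicities, cells are mean expenditures.
by_age = toy.pivot_table(index="Age", columns="Ethnicity",
                         values="Expenditures", aggfunc="mean")

# Overall mean per ethnicity, ignoring age, for comparison with the table above.
overall = toy.groupby("Ethnicity")["Expenditures"].mean()

print(by_age)
print(overall)
```

Comparing `by_age` with `overall` shows whether a gap in the overall means also appears within each age, or whether it mostly reflects how many rows each group has at each age.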
In [ ]:
# Write your query below and set `df_answer` to the resulting dataframe
df_answer = None
print(df_answer)
After analyzing this dataset, was there discrimination in the expenditures across different ethnicities?
In [ ]:
# Write your answer below by setting `discrimination` to True or False
discrimination = None
If this clue changes your answer, try again below. Otherwise, if you are confident in your answer above, leave the following untouched.
In [ ]:
df_answer_clue = None
print(df_answer_clue)
In [ ]:
discrimination_clue = None