Examining Racial Discrimination in the US Job Market

Background

Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

Data

In the dataset provided, each row represents a résumé. The 'race' column has two values, 'b' and 'w', indicating a black-sounding or a white-sounding name. The 'call' column has two values, 1 and 0, indicating whether or not the résumé received a call from an employer.

Note that the 'b' and 'w' values in 'race' were assigned randomly to the résumés when they were presented to the employer.


In [1]:
# Question 1) What test is appropriate for this problem? Does CLT apply?

#    A two-sample z-test for the difference between two proportions (and the matching confidence
#    interval) is appropriate because a) 'call' is a binary outcome, so we are comparing the callback
#    rates of two groups: résumés with black-sounding names and résumés with white-sounding names, and
#    b) the sample size is large enough: the total number of data points is greater than 30 (n > 30).

#    The Central Limit Theorem applies here. It says that the sampling distribution of a sample mean
#    or sample proportion is approximately normal when the sample is large enough (a common rule of
#    thumb is n >= 30 and, for proportions, at least about 10 successes and 10 failures per group).
#    This lets us use the normal distribution to build the test statistic and confidence interval.

# Question 2) What are the null and alternate hypotheses?

#    Null Hypothesis : Ho:μ1=μ2
#            That there is no difference between the mean of calls received (the callback rate) by
#            CVs with white-sounding names and that of CVs with black-sounding names.

#    Alternative Hypothesis : Ha:μ1≠μ2
#            That there is a difference between the mean of calls received (the callback rate) by
#            CVs with white-sounding names and that of CVs with black-sounding names.
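
The large-sample condition mentioned in Question 1 can be checked mechanically. The sketch below is only an illustration of the common "at least 10 successes and 10 failures" rule of thumb; the function name `clt_ok_for_proportion`, the threshold default, and the counts in the usage lines are assumptions made for this example and are not taken from the dataset.

```python
def clt_ok_for_proportion(successes, n, threshold=10):
    """Rule-of-thumb check for the normal approximation to a sample proportion.

    Returns True when both the number of successes and the number of
    failures in a group of size n reach the threshold.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Made-up counts, for illustration only (not values from the study).
print(clt_ok_for_proportion(15, 200))  # enough successes and failures
print(clt_ok_for_proportion(3, 40))    # too few successes
```

With the real data, the same check would be run once per group, passing the number of callbacks and the number of résumés in that group.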

In [2]:
import pandas as pd
import numpy as np
from scipy import stats

# pd.read_stata is the public pandas entry point for reading Stata .dta files
data = pd.read_stata('data/us_job_market_discrimination.dta')

# Question 3) Compute margin of error, confidence interval, and p-value:

# Compute the margin of error for the difference between the two proportions:
# 1) For each group, find the sample size n and the sample proportion p-hat
# 2) For each group, multiply p-hat by 1 - p-hat
# 3) Divide each result by that group's n
# 4) Add the two results and take the square root, which gives the standard error
# 5) Multiply the standard error by the appropriate z*-value for the desired confidence level.
#    In this case, I will use a 95% confidence level, which translates to a critical value of 1.96.

# First, split the data by race and compute each group's sample proportion of callbacks.
data_black_sounding = data[data.race=='b']
data_white_sounding = data[data.race=='w']
phat_b = (sum(data[data.race=='b'].call)/len(data_black_sounding.call))
phat_w = (sum(data[data.race=='w'].call)/len(data_white_sounding.call))

# Calculate the (unpooled) standard error of the difference between the two proportions
SE = np.sqrt((phat_b*(1 - phat_b)/(len(data_black_sounding))) + (phat_w*(1 - phat_w) /(len(data_white_sounding))))

# Calculate the margin of error using a 95% confidence level.
crit_val = 1.96
MoE = abs(crit_val * SE)
print("The sample population proportion has a +/- %0.4F margin of error" % MoE)

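# Calculate the confidence intervals. In this instance, I will have two results:
# each sample proportion +/- the margin of error.
# Note: MoE comes from the standard error of the difference, so these intervals are wider
# than intervals built from each group's own standard error.
# The print statements below list the upper bound first.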
phat_b_CI_plus = phat_b + MoE
phat_b_CI_minus = phat_b - MoE
phat_w_CI_plus = phat_w + MoE
phat_w_CI_minus = phat_w - MoE

print("The proportion of CVs with black-sounding names that receive a callback is between %0.4F and %0.4F " % (phat_b_CI_plus, phat_b_CI_minus))
print("The proportion of CVs with white-sounding names that receive a callback is between %0.4F and %0.4F " % (phat_w_CI_plus, phat_w_CI_minus))

# Calculate an appropriate test statistic and the p-value.
# Since we are dealing with proportions, use the following steps:
#    1) Use the z-test statistic formula
#    2) Determine the p-value associated with the test statistic
#    3) Decide between null and alternative hypothesis
              
# The null hypothesis expects no difference between the two proportions: phat_w - phat_b = 0
null = 0

# Pooled callback proportion across both groups (the best estimate of the common rate under H0)
p_total = (sum(data.call)/(len(data.call)))

# Calculate the standard error of the difference using the pooled proportion (assuming that H0 is true)
p_total_SE = np.sqrt((p_total*(1 - p_total)/(len(data_black_sounding))) + (p_total*(1 - p_total) /(len(data_white_sounding))))

# Now calculate the z-score and get the two-sided p-value from the normal survival function
p_sample_proportion = (phat_w - phat_b)
z_score = ((p_sample_proportion - null)/p_total_SE) 
p_values = stats.norm.sf(abs(z_score))*2
print("Z-score is equal to : %5.4F  p-value is equal to: %5.8F" % (z_score, p_values))


The sample population proportion has a +/- 0.0153 margin of error
The proportion of CVs with black-sounding names that receive a callback is between 0.0797 and 0.0492 
The proportion of CVs with white-sounding names that receive a callback is between 0.1118 and 0.0813 
Z-score is equal to : 4.1084  p-value is equal to: 0.00003984
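
The calculation in the cell above can also be packaged as two small functions that work from raw counts. This is a minimal sketch rather than the notebook's own code: the function names `diff_proportion_ci` and `two_proportion_z_test` are hypothetical, the standard library's `statistics.NormalDist` stands in for `scipy.stats.norm`, and the counts in the usage lines are made up for illustration. The interval is built for the difference between the two proportions, which is the quantity the margin of error above actually describes.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, about 1.96 for a 95% confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts, for illustration only (not values from the study):
# 30 callbacks out of 100 in group 1, 20 out of 100 in group 2.
low, high = diff_proportion_ci(30, 100, 20, 100)
z, p = two_proportion_z_test(30, 100, 20, 100)
print("95%% CI for the difference: %0.4f to %0.4f" % (low, high))
print("z = %0.4f, p-value = %0.4f" % (z, p))
```

The test uses the pooled proportion because the null hypothesis assumes a single common callback rate, while the interval uses the unpooled standard error because it makes no such assumption.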

In [3]:
# Question 4) Write a story describing the statistical significance in the context of the 
#             original problem

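#    The two-sided p-value of 0.0000398 is far below the 0.05 significance level (matching the 95%
#    confidence level used above), so we reject the null hypothesis and conclude that there is a
#    statistically significant difference between the callback rates for black-sounding and
#    white-sounding names. Résumés with white-sounding names were called back at a higher rate
#    (roughly 9.7% versus 6.4%), even though the résumés themselves were identical.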

# Question 5) Does your analysis mean that race/name is the most important factor in callback
#             success? Why or why not? If not, how would you amend your analysis?

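#    No. This initial analysis indicates a statistically significant difference between the
#    callback rates of the two groups as divided by race, but it does not show that race is the
#    most important factor, because no other factor was examined. To answer that question, the
#    analysis would need to compare the effect of race with the effects of any other résumé
#    attributes available in the dataset, for example with a regression of 'call' on race and
#    those attributes.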