The Python code below loads all WHISARD violations since 2005 (based on the end-date of the violation period); isolates the violations of laws meant to protect H-2 workers; and provides aggregate counts of the number of employers, certain violations, and workers.
Load all violations, and limit them to those that meet all of the following critera: (a) DATE_END_VIOL_YEAR
is 2005 or later; (b) Classified as having an ACT_ID
of "H2A" or "H2B"; and (c) has an E
(employee) record flag, as opposed to an R
(employer) record flag.
Group all of these violations by their violation "description." Count the number of matching violations for each description.
Identify violations that pertain to U.S. workers, rather than guest workers, and exclude them from the analysis.
Identify violations that pertain to underpaying guest workers.
Calculate the number of workers affected by each set of violations, and the number of employers named (based on the first available of the following: federal EIN, legal name, trade name).
In [1]:
import pandas as pd
import sys
sys.path.append("../utils")
import loaders
Note: loaders
is a custom module to handle most common data-loading operations in these analyses. It is available here.
In [2]:
employers = loaders.load_employers().set_index("CASE_ID")
In [3]:
violations = loaders.load_violations().set_index("CASE_ID")
In [4]:
joined = violations\
.join(employers[[ "ER_EIN", "employer_id" ]])
In [5]:
# Get H-2A and H-2B violations from those cases
h2_employee_violations = joined[
(joined["DATE_END_VIOL_YEAR"] >= 2005) &
(joined["ACT_ID"].isin([ "H2A", "H2B" ])) &
(joined["violation_found"] == True) &
(joined["ER_EE_VIOL"] == "E") # E = "Employee"
]
In [6]:
by_act_and_description = h2_employee_violations.groupby([ "VIOLATION_DESC", "ACT_ID" ])
In [7]:
violation_counts = by_act_and_description\
.size()\
.unstack()\
.fillna(0)\
.sort([ "H2A", "H2B" ], ascending=False)
violation_counts
Out[7]:
In [8]:
non_guestworker_descs = [
"17 Preferential treatment given to H-2A workers",
"02 Unlawful rejection of US workers (2008 & 2010 Rules)",
"Requirement to Hire U.S. Workers - ER failed to properly hire or rehire U.S. workers",
"Layoff- ER improperly laid off similarly employed U.S. workers within 120 days of date of need, unless employee refused or was lawfully rejected",
"Job Opportunity - (U.S. workers) - ER failed to offer U.S. workers bona fide, full-time temp. position due to inequitable qualification requirements",
"Terms and Working Conditions for U.S. Workers - ER failed to offer terms and working conditions as required"
]
In [9]:
guestworker_wage_viols = [
"27 Failed to pay required rate(s) of pay (2008 & 2010 Rules)",
"05 Failed to pay proper rate",
"07 Illegal deductions",
"28 Unlawful deductions (2008 & 2010 Rules)",
"06 Failed to pay 3/4 guarantee",
"09 Illegal charges for housing",
"19 Failed to comply - 3/4-guarantee req (2008 & 2010 Rules)",
"09 Unlawful charges for public housing (2008 & 2010 Rules)",
"Offered Wage- failed to pay the offered wage rate which equals or exceeds the highest of the prevailing wage, Federal, State, or local minimum wage",
"Wages - Prohibited Fees - ER sought or required workers to pay prohibited fees or expenses related to the TEC (petition/agent/attorney/recruitment)",
"Impermissible Deductions - ER failed to specify deductions from pay.",
"Incentive Wage - offered wage based on incentives failed to equal or exceed highest of the PW/Fed./State/local MW on a weekly/bi-weekly/monthly basis.",
"Back Wages due - failure to offer worker bona fide, full-time temporary position comparable to U.S. workers similarly employed Attestation 1"
]
In [10]:
# Make sure that we've correctly transcribed the violation descriptions
assert((violation_counts.ix[non_guestworker_descs].sum(axis=1) > 0).mean() == 1)
assert((violation_counts.ix[guestworker_wage_viols].sum(axis=1) > 0).mean() == 1)
In [11]:
h2_guestworker_violations = h2_employee_violations[
~h2_employee_violations["VIOLATION_DESC"].isin(non_guestworker_descs)
]
In [12]:
h2_guestworker_wage_violations = h2_employee_violations[
h2_employee_violations["VIOLATION_DESC"].isin(guestworker_wage_viols)
]
In [13]:
h2_guestworker_violations["employer_id"].nunique()
Out[13]:
In [14]:
h2_guestworker_violations[
(h2_guestworker_violations["employer_id"] == h2_guestworker_violations["ER_EIN"])
]["ER_EIN"].nunique()
Out[14]:
Note: The first count above uses employers' legal or trade names if their case data does not include an EIN. The second count includes only employers with EINs.
In [15]:
h2_guestworker_violations["CASE_EER_ID"].nunique()
Out[15]:
Note: Individual workers are uniquely identified on a per-case basis, but are not tracked across cases or employers.
In [16]:
h2_guestworker_wage_violations["CASE_EER_ID"].nunique()
Out[16]:
Note: Includes "illegal deductions" and permutations thereof. Excludes "failed for pay for inbound/outbound transportation" and permutations thereof. Also excludes "failed to pay wages when due."