In [1]:

    
import numpy as np
import pandas as pd

Why?

From 1993 - 2010, the U.S. Census Bureau published the Consolidated Federal Funds Report, a dataset that contained total federal spending and how money flowed to each state, county, and congressional district.

When they stopped, the country was left with no information about the federal dollars in the states.

Federal Dollars in States

Money directly to individuals
- Social Security
- Medicare
- Food Stamps (SNAP)
- etc.
Money to state and local governments
- Transportation
- General welfare programs (TANF, WIC)
- Medicaid
Federal contracts performed in a state
Salaries/benefits of a state's federal workers

1. Money to Individuals

source: Bureau of Economic Analysis, Annual State Personal Income and Employment, Personal Current Transfer Receipts (SA35)
Totals include the following LineCodes from the 2000 Current transfer receipts of individuals from governments series:

(list the codes here)



In [2]:

    
#money to individuals
indv_cols = ['year', 'state_id', 'federal', 'total']
indv = pd.read_csv(
    'data/individualreceiptgovtstate.txt',
    usecols=indv_cols,
    index_col=['year','state_id']
    )
indv = indv.rename(columns = {
    'total' : 'total_indv_receipt',
    'federal' : 'fed_indv_receipt'
})
indv.head()









    Out[2]:






  
    
      
      
      total_indv_receipt
      fed_indv_receipt
    
    
      year
      state_id
      
      
    
  
  
    
      2002
      1
        19682018000
       15313883000
    
    
      2
         3189286000
        2257451000
    
    
      3
        20486993000
       15221625000
    
    
      4
        12298894000
        9380832000
    
    
      5
       139072970000
       93878242000

2. Money to State and Local Governments

source: Census State and Local Government Finances
state revenue totals = general revenue (i.e., exclude liquor, utility, and insurance trust revenues)



In [3]:

    
#money to state and local governments
govt_cols = ['year', 'state_id', 'level', 'general_rev', 'fed_rev']
govt = pd.read_csv(
    'data/revenuegeneralstate.txt',
    usecols = govt_cols,
    index_col=['year','state_id']
)
#we only want the subset of data where level = 1 (combination of state and local governments)
govt = govt[govt['level']==1][['general_rev', 'fed_rev']]
govt = govt.rename(columns = {
    'general_rev' : 'total_rev'
})
govt.head()









    Out[3]:






  
    
      
      
      total_rev
      fed_rev
    
    
      year
      state_id
      
      
    
  
  
    
      2002
      1
        23113713000
        6263380000
    
    
      2
         7208865000
        1789266000
    
    
      3
        25523515000
        5718580000
    
    
      4
        13209552000
        3645274000
    
    
      5
       226639222000
       48249715000

3. Federal Contracts

source: USASpending.gov
primary awards (no sub-contracts) by place of performance state



In [4]:

    
#federal contracts
contract_cols = ['year', 'state_id', 'total_amount', 'dod_amount']
contracts = pd.read_csv(
    'data/federalcontractstate.txt',
    usecols = contract_cols,
    index_col=['year','state_id']
)
contracts = contracts.rename(columns= {
    'total_amount' : 'fed_contracts',
    'dod_amount' : 'dod_contracts'
})
contracts.head()









    Out[4]:






  
    
      
      
      fed_contracts
      dod_contracts
    
    
      year
      state_id
      
      
    
  
  
    
      2011
      1
       1.138446e+10
       9.759450e+09
    
    
      2012
      1
       1.073141e+10
       9.095490e+09
    
    
      2009
      1
       1.109927e+10
       9.531534e+09
    
    
      2008
      1
       1.000130e+10
       8.548404e+09
    
    
      2010
      1
       9.966980e+09
       8.157725e+09

4. Federal Compensation

source: Bureau of Economic Analysis
Annual State Personal Income and Employment, 2002-2012
Line 2001 = Federal civilian employees
Line 2002 = Military employees



In [5]:

    
#federal employee salaries/benefits
emp_cols = ['year', 'state_id', 'emp_fed_total', 'comp_fed_total', 'emp_total', 'comp_total']
emp = pd.read_csv(
    'data/federaljobstate.txt',
    usecols = emp_cols,
    index_col=['year','state_id']
)
emp = emp.rename(columns = {
    'emp_total' : 'total_emp',
    'comp_total' : 'total_comp',
    'emp_fed_total' : 'fed_emp',
    'comp_fed_total' : 'fed_comp'
})
emp.head()









    Out[5]:






  
    
      
      
      total_emp
      total_comp
      fed_emp
      fed_comp
    
    
      year
      state_id
      
      
      
      
    
  
  
    
      2002
      62
       136364000
       6135061000000
       4911000
       276494000000
    
    
      2003
      62
       136072000
       6356044000000
       4933000
       304410000000
    
    
      2004
      62
       137601000
       6730345000000
       4880000
       323937000000
    
    
      2005
      62
       139559000
       7078761000000
       4816000
       343384000000
    
    
      2006
      62
       141908000
       7491672000000
       4825000
       355886000000

Create the grand totals

federal receipts to individuals + federal money to state/local governments + contract $ + federal compensation



In [6]:

    
#stick everything together
all = (indv 
    .join(govt)
    .join(contracts)
    .join(emp))
all.reset_index(inplace=True)
#get rid of years < 2004
#all = all[all['year']>2003]
all.head()









    Out[6]:






  
    
      
      year
      state_id
      total_indv_receipt
      fed_indv_receipt
      total_rev
      fed_rev
      fed_contracts
      dod_contracts
      total_emp
      total_comp
      fed_emp
      fed_comp
    
  
  
    
      0
       2002
       1
        19682018000
       15313883000
       2.311371e+10
        6263380000
      NaN
      NaN
        1973403
        74291247000
        88400
        4907355000
    
    
      1
       2002
       2
         3189286000
        2257451000
       7.208865e+09
        1789266000
      NaN
      NaN
         319714
        15774610000
        39647
        2275956000
    
    
      2
       2002
       3
        20486993000
       15221625000
       2.552352e+10
        5718580000
      NaN
      NaN
        2356727
        97906942000
        82234
        4530946000
    
    
      3
       2002
       4
        12298894000
        9380832000
       1.320955e+10
        3645274000
      NaN
      NaN
        1203665
        41601200000
        39789
        1831098000
    
    
      4
       2002
       5
       139072970000
       93878242000
       2.266392e+11
       48249715000
      NaN
      NaN
       15642910
       794904524000
       483628
       27210827000



In [7]:

    
#total federal dollars in state = 
# federal receipts to individuals + 
# federal money to state/local governments + 
# contract $ + 
# federal compensation

all['total'] = (
    all['total_indv_receipt'] + 
    all['fed_rev'] + 
    all['fed_contracts'] +
    all['fed_comp']
)



In [8]:

    
#merge in state names
states = pd.read_csv(
    'data/state.txt')
all = pd.merge(all, states[['id', 'state_ansi', 'state_abbr', 'state_name']], left_on='state_id', right_on='id')

Inflation-Adjust

Adjust totals to FY 2012 dollars.
Deflator source, Office of Management and Budget:
10.1--Gross Domestic Product and Deflators Used in the Historical Tables: 1940-2019 from the FY 2015 President's Budget Request.



In [9]:

    
#grab the deflators published by OMB as part of the FY 2015 budget request
gdp = pd.read_excel('data/hist10z1.xls', 0, parse_cols=[0,1,2])
gdp.columns = ['year', 'gdp', 'gdp_price_index']
def clean_year(x):
    try:
        return int(x[:4])
    except:
        return np.nan
gdp['year'] = gdp['year'].map(lambda x: clean_year(x))
gdp = gdp.dropna(subset=['year'])
gdp['year'] = gdp['year'].astype(np.int64)
#adjust deflators to base year 2012
denom = gdp[gdp['year']==2012]['gdp_price_index'].values[0]
gdp['gdp_price_index'] = gdp['gdp_price_index']/denom



In [10]:

    
#merge deflators into main data and inflation-adjust
all = pd.merge(all, gdp[['year','gdp_price_index']], left_on='year', right_on='year')
all['total_adj'] = all['total']/all['gdp_price_index']

Per-Capita

Get population estimates for per-capita numbers
source: Census Population Estimates, State Characteristics, Vintage 2013



In [11]:

    
#population
pop = pd.read_csv(
    'data/populationstate.txt'
    )
pop = pop.rename(columns = {'value' : 'pop'})
all = pd.merge(all, pop, left_on=['year','state_id'], right_on=['year','state_id'])

Visualize

Note: need v0.4.0 (Sept. 2014) version of Seaborn to fix plotting issue when series has null values



In [12]:

    
%matplotlib inline
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

#tweak dataset format for Seaborn time series plotting
viz2 = all
viz2 = viz2[['state_abbr', 'year', 'fed_indv_receipt', 'fed_contracts', 'fed_comp', 'fed_rev', 'total', 'gdp_price_index', 'pop']]
viz2['Individual Aid'] = viz2['fed_indv_receipt'] / viz2['gdp_price_index']
viz2['Per-Person Individual Aid'] = viz2['Individual Aid'] / viz2['pop']
viz2['Contracts'] = viz2['fed_contracts'] / viz2['gdp_price_index']
viz2['Per-Person Contracts'] = viz2['Contracts'] / viz2['pop']
viz2['State Revenue'] = viz2['fed_rev'] / viz2['gdp_price_index']
viz2['Per-Person State Revenue'] = viz2['State Revenue'] / viz2['pop']
viz2['Compensation'] = viz2['fed_comp'] / viz2['gdp_price_index'] 
viz2['Per-Person Compensation'] = viz2['Compensation'] / viz2['pop']
viz2['Total'] = viz2['total'] / viz2['gdp_price_index'] 
viz2['Per-Person Total'] = viz2['Total'] / viz2['pop']
viz2 = viz2.rename(columns = {'state_abbr' : 'State', 'year' : 'Year'})
spendtype = viz2[['State', 'Year', 'Individual Aid','Contracts', 'State Revenue', 'Compensation']]
spendtype_pc = viz2[['State', 'Year', 'Per-Person Individual Aid','Per-Person Contracts', 'Per-Person State Revenue', 'Per-Person Compensation']]

spendtype = pd.melt(spendtype, id_vars=['State', 'Year'], var_name = 'Fed $ Type', value_name = 'Amount')
spendtype_pc = pd.melt(spendtype_pc, id_vars=['State', 'Year'], var_name = 'Fed $ Type', value_name = 'Amount')
spendtype_pc.head()









    Out[12]:






  
    
      
      State
      Year
      Fed $ Type
      Amount
    
  
  
    
      0
       AL
       2002
       Per-Person Individual Aid
       4218.937
    
    
      1
       AK
       2002
       Per-Person Individual Aid
       4337.702
    
    
      2
       AZ
       2002
       Per-Person Individual Aid
       3481.552
    
    
      3
       AR
       2002
       Per-Person Individual Aid
       4278.875
    
    
      4
       CA
       2002
       Per-Person Individual Aid
       3322.725

Prep Data



In [13]:

    
#tweak dataset format for Seaborn time series plotting
viz2 = all
viz2 = viz2[['state_abbr', 'year', 'fed_indv_receipt', 'fed_contracts', 'fed_comp', 'fed_rev', 'total', 'gdp_price_index', 'pop']]
viz2['Individual Aid'] = viz2['fed_indv_receipt'] / viz2['gdp_price_index']
viz2['Per-Person Individual Aid'] = viz2['Individual Aid'] / viz2['pop']
viz2['Contracts'] = viz2['fed_contracts'] / viz2['gdp_price_index']
viz2['Per-Person Contracts'] = viz2['Contracts'] / viz2['pop']
viz2['State Revenue'] = viz2['fed_rev'] / viz2['gdp_price_index']
viz2['Per-Person State Revenue'] = viz2['State Revenue'] / viz2['pop']
viz2['Compensation'] = viz2['fed_comp'] / viz2['gdp_price_index'] 
viz2['Per-Person Compensation'] = viz2['Compensation'] / viz2['pop']
viz2['Total'] = viz2['total'] / viz2['gdp_price_index'] 
viz2['Per-Person Total'] = viz2['Total'] / viz2['pop']
viz2 = viz2.rename(columns = {'state_abbr' : 'State', 'year' : 'Year'})

spendtype = viz2[['State', 'Year', 'Individual Aid','Contracts', 'State Revenue', 'Compensation']]
spendtype = pd.melt(spendtype, id_vars=['State', 'Year'], var_name = 'Fed $ Type', value_name = 'Amount')

spendtype_pc = viz2[['State', 'Year', 'Per-Person Individual Aid','Per-Person Contracts', 'Per-Person State Revenue', 'Per-Person Compensation']]
spendtype_pc = pd.melt(spendtype_pc, id_vars=['State', 'Year'], var_name = 'Fed $ Type', value_name = 'Amount')
spendtype_pc.head()









    Out[13]:






  
    
      
      State
      Year
      Fed $ Type
      Amount
    
  
  
    
      0
       AL
       2002
       Per-Person Individual Aid
       4218.937
    
    
      1
       AK
       2002
       Per-Person Individual Aid
       4337.702
    
    
      2
       AZ
       2002
       Per-Person Individual Aid
       3481.552
    
    
      3
       AR
       2002
       Per-Person Individual Aid
       4278.875
    
    
      4
       CA
       2002
       Per-Person Individual Aid
       3322.725

Federal Per-Capita Spending, U.S.

Federal Per-Capita $ Trends, by State



In [21]:

    
sns.factorplot(
    'Year', 'Amount', 'Fed $ Type',
    spendtype_pc[spendtype_pc['State']=='US'],kind='point',size=5
    )









    Out[21]:





<seaborn.axisgrid.FacetGrid at 0x28379550>



In [15]:

    
#plot per-capita federal spending (by type) in the U.S.
#exclude Washington, DC because its federal compensation numbers skew so high
sns.set(style="white")
flatui = ["#9b59b6", "#3498db", "#95a5a6", "#e74c3c", "#34495e", "#2ecc71"]
sns.factorplot(
    'Year', 'Amount', 'Fed $ Type',
    spendtype_pc[spendtype_pc['State']<>'DC'],
    col='State', col_wrap=4, kind='point',
    palette=sns.color_palette(flatui)
    )









    Out[15]:





<seaborn.axisgrid.FacetGrid at 0x17612a90>

Compare Each State's Per-Capita Fed. $ To U.S. Per-Capita



In [16]:

    
stateus_pc = viz2[['State', 'Year', 'Per-Person Total']]
#limit DataFrame to years where we have data for all 4 ways that federal $ go to states
#get rid of DC
stateus_pc = stateus_pc[(stateus_pc['Year'].isin([2007, 2008, 2009, 2010, 2011])) & (~stateus_pc['State'].isin(['DC']))]
us = stateus_pc[stateus_pc['State']=='US']
us = us.rename(columns={'Per-Person Total' : 'US Per-Person Total'})
stateus_pc = stateus_pc[stateus_pc['State']<>'US']
stateus_pc = pd.merge(
    stateus_pc, us,
    left_on=['Year'], right_on=['Year'],
    suffixes=('', '_US'))
stateus_pc = stateus_pc[['State', 'Year', 'Per-Person Total', 'US Per-Person Total']]
stateus_pc = pd.melt(
    stateus_pc,
    id_vars=['Year', 'State'],
    var_name = 'Type', value_name = 'Amount')
sns.set(style="white")
sns.factorplot(
    'Year', 'Amount', 'Type',
    stateus_pc,
    col='State', col_wrap=4, kind='bar'
    )









    Out[16]:





<seaborn.axisgrid.FacetGrid at 0x213154e0>

Federal Contracts



In [17]:

    
contracts = pd.read_csv('data/fed_contract_totals.txt')
contracts.head()









    Out[17]:






  
    
      
      state_abbr
      year
      total
      total_adj
      dod
      dod_adj
      non-dod
      non-dod_adj
      total_per_capita
      total_per_capita_adj
      dod_per_capita
      dod_per_capita_adj
      other_per_capita
      other_per_capita_adj
      dod_percent_total
      other_percent_total
      update_date
    
  
  
    
      0
       AK
       2012
       2.449410e+09
       2.566977e+09
       2.021896e+09
       2.118944e+09
       4.275133e+08
       4.480333e+08
       3353.95
       3514.933976
       2768.56
       2901.446238
       585.39
       613.487738
       82.55
       17.45
       9/5/2014 17:30:30.164745
    
    
      1
       AK
       2013
       1.625632e+09
       1.678331e+09
       1.279484e+09
       1.320963e+09
       3.461474e+08
       3.573687e+08
       2211.35
       2283.037374
       1740.48
       1796.902746
       470.86
       486.124303
       78.71
       21.29
       9/5/2014 17:30:30.171559
    
    
      2
       AK
       2011
       2.262005e+09
       2.412034e+09
       1.821573e+09
       1.942390e+09
       4.404323e+08
       4.696442e+08
       3127.02
       3334.420985
       2518.16
       2685.178076
       608.86
       649.242909
       80.53
       19.47
       9/5/2014 17:30:30.178061
    
    
      3
       AK
       2007
       2.360058e+09
       2.672470e+09
       1.876305e+09
       2.124680e+09
       4.837532e+08
       5.477898e+08
       3469.14
       3928.365983
       2758.06
       3123.157060
       711.09
       805.220247
       79.50
       20.50
       9/5/2014 17:30:30.184181
    
    
      4
       AK
       2009
       2.814503e+09
       3.086416e+09
       2.194235e+09
       2.406223e+09
       6.202682e+08
       6.801932e+08
       4027.08
       4416.142121
       3139.58
       3442.899441
       887.50
       973.242680
       77.96
       22.04
       9/5/2014 17:30:30.190148



In [ ]:

    
contracts = contracts[~contracts['state_abbr'].isin(['DC', 'PR', 'US'])]
g = sns.FacetGrid(contracts, col='state_abbr', col_wrap=4)
g.map(sns.pointplot, 'year', 'total_adj')
g.set_axis_labels("Year", "Per-Capita Fed. Contract $")

		total_indv_receipt	fed_indv_receipt
year	state_id
2002	1	19682018000	15313883000
	2	3189286000	2257451000
	3	20486993000	15221625000
	4	12298894000	9380832000
	5	139072970000	93878242000

		fed_contracts	dod_contracts
year	state_id
2011	1	1.138446e+10	9.759450e+09
2012	1	1.073141e+10	9.095490e+09
2009	1	1.109927e+10	9.531534e+09
2008	1	1.000130e+10	8.548404e+09
2010	1	9.966980e+09	8.157725e+09

	State	Year	Fed $ Type	Amount
0	AL	2002	Per-Person Individual Aid	4218.937
1	AK	2002	Per-Person Individual Aid	4337.702
2	AZ	2002	Per-Person Individual Aid	3481.552
3	AR	2002	Per-Person Individual Aid	4278.875
4	CA	2002	Per-Person Individual Aid	3322.725

	state_abbr	year	total	total_adj	dod	dod_adj	non-dod	non-dod_adj	total_per_capita	total_per_capita_adj	dod_per_capita	dod_per_capita_adj	other_per_capita	other_per_capita_adj	dod_percent_total	other_percent_total	update_date
0	AK	2012	2.449410e+09	2.566977e+09	2.021896e+09	2.118944e+09	4.275133e+08	4.480333e+08	3353.95	3514.933976	2768.56	2901.446238	585.39	613.487738	82.55	17.45	9/5/2014 17:30:30.164745
1	AK	2013	1.625632e+09	1.678331e+09	1.279484e+09	1.320963e+09	3.461474e+08	3.573687e+08	2211.35	2283.037374	1740.48	1796.902746	470.86	486.124303	78.71	21.29	9/5/2014 17:30:30.171559
2	AK	2011	2.262005e+09	2.412034e+09	1.821573e+09	1.942390e+09	4.404323e+08	4.696442e+08	3127.02	3334.420985	2518.16	2685.178076	608.86	649.242909	80.53	19.47	9/5/2014 17:30:30.178061
3	AK	2007	2.360058e+09	2.672470e+09	1.876305e+09	2.124680e+09	4.837532e+08	5.477898e+08	3469.14	3928.365983	2758.06	3123.157060	711.09	805.220247	79.50	20.50	9/5/2014 17:30:30.184181
4	AK	2009	2.814503e+09	3.086416e+09	2.194235e+09	2.406223e+09	6.202682e+08	6.801932e+08	4027.08	4416.142121	3139.58	3442.899441	887.50	973.242680	77.96	22.04	9/5/2014 17:30:30.190148