weightedcalcs
The example below uses weightedcalcs
to analyze a slice of the American Community Survey's 2015 data for Wyoming.
In [1]:
import weightedcalcs as wc
import pandas as pd
In [2]:
responses = pd.read_csv("../data/acs-2015-pums-wy-simple.csv")
In [3]:
responses.head()
Out[3]:
In addition to the full list of responses, let's create a subset including only adult respondents, since we'll be focusing on income later.
In [4]:
adults = responses[responses["age"] >= 18]
In [5]:
adults.head()
Out[5]:
In [6]:
calc = wc.Calculator("PWGTP")
In [7]:
calc.mean(adults, "income").round()
Out[7]:
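For intuition about what a weighted mean does, here is a minimal sketch on made-up data. It assumes that each row's value counts in proportion to its person weight (the `PWGTP` column). The `toy` frame and the `weighted_mean` helper are hypothetical names used only for illustration; this is not weightedcalcs' own implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three respondents, the first standing in for 3 people.
toy = pd.DataFrame({
    "income": [10_000, 50_000, 90_000],
    "PWGTP": [3, 1, 1],
})

def weighted_mean(df, value_col, weight_col):
    """Sum of value * weight divided by the sum of the weights."""
    return np.average(df[value_col], weights=df[weight_col])

wm = weighted_mean(toy, "income", "PWGTP")  # (30000 + 50000 + 90000) / 5
unweighted = toy["income"].mean()           # ignores the weights entirely
```

On this toy data the weighted mean is 34,000, while the unweighted mean is 50,000, because the low-income respondent represents more people.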
In [8]:
calc.std(adults, "income").round()
Out[8]:
In [9]:
calc.median(adults, "income")
Out[9]:
In [10]:
calc.quantile(adults, "income", 0.75)
Out[10]:
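The median and the 0.75 quantile above are weighted too. One common definition, sketched below in plain Python, sorts the values and returns the first one at which the cumulative share of weight reaches `q`. This is an assumption about the general technique rather than a description of weightedcalcs' internals, which may handle ties and interpolation differently. The `weighted_quantile` helper and the numbers are hypothetical.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (no interpolation)."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    pairs = sorted(zip(values, weights))        # sort by value
    total = sum(weight for _, weight in pairs)  # total weight
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]

# Hypothetical toy data: the lowest income carries 3 of the 5 units of weight.
incomes = [10_000, 50_000, 90_000]
weights = [3, 1, 1]

median = weighted_quantile(incomes, weights, 0.5)
upper_quartile = weighted_quantile(incomes, weights, 0.75)
```

Here the weighted median is 10,000 (that respondent alone covers 60% of the weight) and the 0.75 quantile is 50,000.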
~43% of Wyoming residents are married:
In [11]:
calc.distribution(responses, "marriage_status").round(3).sort_values(ascending=False)
Out[11]:
~56% of adult Wyoming residents are married:
In [12]:
calc.distribution(adults, "marriage_status").round(3).sort_values(ascending=False)
Out[12]:
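A weighted distribution can be read as each category's share of the total weight. The sketch below shows that idea with pandas on made-up data. The category labels and the `weighted_distribution` helper are hypothetical, and this is only an illustration of the concept, not weightedcalcs' implementation.

```python
import pandas as pd

# Hypothetical toy data; the labels are illustrative, not the ACS coding.
toy = pd.DataFrame({
    "marriage_status": ["married", "never married", "married", "divorced"],
    "PWGTP": [2, 5, 1, 2],
})

def weighted_distribution(df, category_col, weight_col):
    """Share of total weight that falls into each category."""
    return df.groupby(category_col)[weight_col].sum() / df[weight_col].sum()

dist = weighted_distribution(toy, "marriage_status", "PWGTP")
```

The shares sum to 1. In this toy frame "married" holds 3 of the 10 units of weight, so its share is 0.3 even though it accounts for half of the rows.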
In [13]:
grp_marriage_sex = adults.groupby(["marriage_status", "gender"])
For reference, here's how many responses fall into each category:
In [14]:
grp_marriage_sex.size().unstack()
Out[14]:
In [15]:
calc.mean(grp_marriage_sex, "income").round().astype(int)
Out[15]:
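Passing a grouped DataFrame yields one weighted statistic per group. As a rough sketch of the same idea in plain pandas, the weighted mean per group is the per-group sum of value times weight divided by the per-group sum of weights. The data and group labels below are hypothetical, and the code illustrates the arithmetic only, not how weightedcalcs computes it.

```python
import pandas as pd

# Hypothetical toy data with a single grouping column.
toy = pd.DataFrame({
    "gender": ["female", "female", "male", "male"],
    "income": [20_000, 40_000, 30_000, 50_000],
    "PWGTP": [1, 3, 2, 2],
})

# Per-group numerator: sum of income * weight.
weighted_sum = (toy["income"] * toy["PWGTP"]).groupby(toy["gender"]).sum()
# Per-group denominator: sum of weights.
weight_total = toy["PWGTP"].groupby(toy["gender"]).sum()

group_means = weighted_sum / weight_total
```

For this toy data the result is 35,000 for "female" and 40,000 for "male".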
In [16]:
calc.std(grp_marriage_sex, "income").round()
Out[16]:
In [17]:
calc.median(grp_marriage_sex, "income")
Out[17]:
In [18]:
calc.quantile(grp_marriage_sex, "income", 0.75)
Out[18]: