REST API DEMO: CENSUS

The US Census Bureau provides access to its vast stores of demographic data via its API at https://api.census.gov.

The "I" in API stands for interface, the entry point into an application: it's the steering wheel and dashboard for whatever more or less complicated vehicle you're driving. In the case of the Census, the main component of the application is a relational database management system. There are probably several GUIs designed for human readers; the Census API is meant for communication between your software and their application.

In a REST API, half of the interface is HTTP, the already universal system for transferring data over the internet between applications (such as a web server and your browser). From there we just need documentation for how to construct the URL in a standards-compliant way.

In this example, we explore the American Community Survey 5-Year Data (2009-2015), or ACS5, API. It's documented here:

https://www.census.gov/data/developers/data-sets/acs-5year.html

The ACS5 API has 4 services: Detail Tables, Subject Tables, Data Profile, and Comparison Profile. We'll examine the service for the Detail Tables. How you navigate documentation to discover exactly how to use an API varies widely from one API to the next, and often requires a bit of trial and error.

First, we should understand what this service does. This is mentioned on the API page:

  • Detail tables contain the most detailed cross-tabulations, many of which are published down to block groups. The data are population counts. There are over 64,000 variables in this dataset.

So, we can use this service to extract ACS data down to a block group level. To see an example of the output, click on the Example Call link:

https://api.census.gov/data/2015/acs/acs5?get=NAME,B01001_001E&for=state:*

You see that the response lists each state with the two requested variables, NAME and B01001_001E, and that the response is in JSON format.

Now, open up the Examples and Supported Geography link (which actually provides documentation for all 4 services in the API). The first row in the table applies to our service.

Click on the examples link. I find it most effective to implement a few examples that work, and then explore how changing values affects the results.

https://api.census.gov/data/2015/acs5?get=NAME,AIANHH&for=county&in=state:24#irrelevant

Section             Description
https://            scheme
api.census.gov      authority, or simply host if there's no user authentication
/data/2015/acs5     path to a service/resource within a hierarchy
?                   beginning of the "query" component of a URL
get=NAME,AIANHH     first query parameter
&                   query parameter separator
for=county          second query parameter
&                   query parameter separator
in=state:24         third query parameter
#                   beginning of the "fragment" component of a URL
irrelevant          the fragment is a client-side pointer; it isn't even sent to the server
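
The same breakdown can be reproduced in code. The following is a minimal sketch using Python's standard `urllib.parse` module (not something the Census documentation itself prescribes) to split the example URL into the components listed in the table:

```python
from urllib.parse import urlsplit, parse_qsl

url = "https://api.census.gov/data/2015/acs5?get=NAME,AIANHH&for=county&in=state:24#irrelevant"

# urlsplit separates the URL into its five top-level components.
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # api.census.gov
print(parts.path)      # /data/2015/acs5
print(parts.query)     # get=NAME,AIANHH&for=county&in=state:24
print(parts.fragment)  # irrelevant

# parse_qsl splits the query component into (name, value) pairs, in order.
params = parse_qsl(parts.query)
print(params)  # [('get', 'NAME,AIANHH'), ('for', 'county'), ('in', 'state:24')]
```

Note that the scheme is reported without the `://` separator, and that the fragment is available only because we are parsing the URL on the client side.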

In [7]:
import requests
import pandas as pd

In [4]:
path = 'https://api.census.gov/data/2015/acs5'
query = {'get':'NAME,AIANHH', 'for':'county', 'in':'state:24'}
response = requests.get(path, params=query)
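
Passing `params=query` leaves it to `requests` to turn the dictionary into the query component of the URL. As a rough illustration of that step, here is a sketch using the standard library's `urlencode`; it is an assumption that `requests` produces exactly the same string, and `response.url` shows the URL that was actually requested.

```python
from urllib.parse import urlencode

path = "https://api.census.gov/data/2015/acs5"
query = {"get": "NAME,AIANHH", "for": "county", "in": "state:24"}

# By default, urlencode percent-encodes reserved characters:
# "," becomes %2C and ":" becomes %3A.
encoded = urlencode(query)
print(path + "?" + encoded)

# Marking "," and ":" as safe reproduces the URL as written in the documentation.
readable = urlencode(query, safe=",:")
print(path + "?" + readable)
```

Both forms describe the same three parameters; the percent-encoded one is simply the stricter spelling of the same query.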

In [10]:
print(response.text)


[["NAME","AIANHH","state","county"],
["Allegany County, Maryland",null,"24","001"],
["Anne Arundel County, Maryland",null,"24","003"],
["Baltimore County, Maryland",null,"24","005"],
["Calvert County, Maryland",null,"24","009"],
["Caroline County, Maryland",null,"24","011"],
["Carroll County, Maryland",null,"24","013"],
["Cecil County, Maryland",null,"24","015"],
["Charles County, Maryland",null,"24","017"],
["Dorchester County, Maryland",null,"24","019"],
["Frederick County, Maryland",null,"24","021"],
["Garrett County, Maryland",null,"24","023"],
["Harford County, Maryland",null,"24","025"],
["Howard County, Maryland",null,"24","027"],
["Kent County, Maryland",null,"24","029"],
["Montgomery County, Maryland",null,"24","031"],
["Prince George's County, Maryland",null,"24","033"],
["Queen Anne's County, Maryland",null,"24","035"],
["St. Mary's County, Maryland",null,"24","037"],
["Somerset County, Maryland",null,"24","039"],
["Talbot County, Maryland",null,"24","041"],
["Washington County, Maryland",null,"24","043"],
["Wicomico County, Maryland",null,"24","045"],
["Worcester County, Maryland",null,"24","047"],
["Baltimore city, Maryland",null,"24","510"]]

In [13]:
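from io import StringIO

# Recent pandas versions deprecate passing a raw JSON string, so wrap the text in a file-like object
df = pd.read_json(StringIO(response.text))
df.head()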


Out[13]:
                               0       1      2       3
0                           NAME  AIANHH  state  county
1      Allegany County, Maryland    None     24     001
2  Anne Arundel County, Maryland    None     24     003
3     Baltimore County, Maryland    None     24     005
4       Calvert County, Maryland    None     24     009

In [14]:
# Promote the first row (the API's header row) to column names, then drop it from the data
df.columns = df.iloc[0]
df.drop(0, axis="rows", inplace=True)
df.head()


Out[14]:
                            NAME  AIANHH  state  county
1      Allegany County, Maryland    None     24     001
2  Anne Arundel County, Maryland    None     24     003
3     Baltimore County, Maryland    None     24     005
4       Calvert County, Maryland    None     24     009
5       Caroline County, Maryland    None     24     011
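
The header-promotion step is easier to see without pandas in the way. The sketch below uses only the standard library and a shortened, hard-coded copy of the response body shown earlier (so it runs without a network call) to pair each value with its column name:

```python
import json

# A shortened copy of the response body printed earlier in this demo.
body = """[["NAME","AIANHH","state","county"],
["Allegany County, Maryland",null,"24","001"],
["Anne Arundel County, Maryland",null,"24","003"]]"""

rows = json.loads(body)

# The API puts the column names in the first row and the data in the rest.
header, records = rows[0], rows[1:]

# Pair each value with its column name to get one dict per county.
counties = [dict(zip(header, record)) for record in records]
print(counties[0])
```

Because the codes arrive as JSON strings, values such as "001" keep their leading zeros. With the header and records separated like this, `pd.DataFrame(records, columns=header)` is another way to build the labelled table in a single step.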