Every year, the US Census Bureau releases new population estimates for every metropolitan area, county, city and town in the US. They are estimates because the Bureau only conducts a full headcount census every 10 years. In between, it uses data and modeling to estimate the population. After each headcount census, it recalibrates its models based on how close the estimates came to the actual count.
Today, we're going to simulate being in a newsroom on the day these new data are released. We're going to look at how a local news organization handled it, and we're going to show how a little bit of R and ggplot know-how can make the work better, easier and push-button quick next year.
First, let's talk about how a local newspaper covered it. What did they choose to focus on? What numerical measures did they use? Were they the right ones? Were they useful? Did they use any visuals? What could they have done differently?
Now let's take our own crack at this. You are now on deadline. You have until the end of class to create a visual story out of this data, looking at the state of Nebraska. You will need to:
Some suggestions: Fastest growing? Fastest shrinking? Gainers to losers? One-year change vs since 2010? Every county in a lattice chart? Urban vs rural? Counties that have lost population every year this decade? Gained?
Pair up, plan what you are going to do, and get started. To help you, here's some boilerplate code to get you going. NOTE THE read.csv BITS. THEY PULL THE DATA STRAIGHT FROM THE URL.
In [1]:
library(dplyr)
library(ggplot2)
In [2]:
counties <- read.csv(url("https://www2.census.gov/programs-surveys/popest/datasets/2010-2017/counties/totals/co-est2017-alldata.csv"))
In [3]:
head(counties)
In [4]:
colnames(counties)
Here's some code to keep only the Nebraska rows, drop the statewide total (SUMLEV 50 marks county-level rows) and calculate the one-year percent change from 2016 to 2017 in a new field called change.
In [5]:
nebraska <- counties %>%
  filter(STNAME == "Nebraska") %>%
  filter(SUMLEV == 50) %>%
  mutate(change = ((POPESTIMATE2017 - POPESTIMATE2016) / POPESTIMATE2016) * 100)
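One possible next step, sketched here rather than prescribed, is to rank counties by the change field and chart the fastest growers. So that it runs without a download, the sketch uses a small made-up data frame called toy in place of nebraska; the county names and numbers are invented for illustration. It also assumes the county name lives in a column called CTYNAME, which you should confirm against the colnames(counties) output.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the `nebraska` data frame. The names and numbers are
# made up for illustration; they are not Census figures.
toy <- data.frame(
  CTYNAME = c("County A", "County B", "County C", "County D"),
  POPESTIMATE2016 = c(1000, 2000, 500, 4000),
  POPESTIMATE2017 = c(1100, 1900, 500, 4100)
)

# Same percent-change formula as the boilerplate, then sort so the
# fastest-growing counties come first.
ranked <- toy %>%
  mutate(change = ((POPESTIMATE2017 - POPESTIMATE2016) / POPESTIMATE2016) * 100) %>%
  arrange(desc(change))

# Keep the top three rows (use a larger n, such as 10, on real data).
top <- head(ranked, 3)

# Horizontal bar chart; reorder() sorts the bars by percent change.
chart <- ggplot(top, aes(x = reorder(CTYNAME, change), y = change)) +
  geom_col() +
  coord_flip() +
  labs(x = "County", y = "Percent change, 2016 to 2017")

print(chart)
```

To try this on the real data, swap toy for nebraska. Since nebraska already has a change column, you can skip the mutate step. Replacing desc(change) with change ranks the fastest-shrinking counties instead.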