Quandl: ADP National Employment Report

In this notebook, we'll take a look at the ADP National Employment Report data set, available on Quantopian. This dataset spans from 2001 through the current day. It contains employment levels as reported by ADP, the payroll service provider. We access this data via the API provided by Quandl. More details on this dataset can be found on Quandl's website.

Blaze

Before we dig into the data, we want to tell you about how you generally access Quantopian partner data sets. These datasets are available using the Blaze library. Blaze provides the Quantopian user with a convenient interface to access very large datasets.

Some of these sets (though not this one) contain many millions of records. Bringing that much data directly into Quantopian Research is not viable, so Blaze lets us provide a simple querying interface and shift the burden to the server side.

To learn more about using Blaze and generally accessing Quantopian partner data, clone this tutorial notebook.

With the preamble in place, let's get started:


In [2]:
# import the dataset
from quantopian.interactive.data.quandl import adp_empl_sec 
# Since this data is public domain and provided by Quandl for free, there is no _free version of this
# data set, as found in the premium sets. This import gets you the entirety of this data set.

# import data operations
from odo import odo
# import other libraries we will use
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
adp_empl_sec.sort('asof_date')


Out[3]:
    asof_date   total_private  goods_producing  service_providing  timestamp
0   2001-04-30  111573.000000  24275.294168     87297.757625       2001-04-30
1   2001-05-31  111398.132821  24140.307995     87257.824826       2001-05-31
2   2001-06-30  111167.552706  23992.631803     87174.920903       2001-06-30
3   2001-07-31  110964.904589  23854.365510     87110.539079       2001-07-31
4   2001-08-31  110719.120440  23699.879744     87019.240696       2001-08-31
5   2001-09-30  110457.629117  23562.811692     86894.817424       2001-09-30
6   2001-10-31  110078.236801  23390.394579     86687.842222       2001-10-31
7   2001-11-30  109716.868489  23206.664911     86510.203579       2001-11-30
8   2001-12-31  109494.189038  23070.799900     86423.389138       2001-12-31
9   2002-01-31  109424.930934  22951.233078     86473.697856       2002-01-31
10  2002-02-28  109334.446257  22861.907749     86472.538508       2002-02-28

The data goes all the way back to 2001 and is updated monthly.
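
The column names suggest that total_private is the sum of goods_producing and service_providing. That relationship is an assumption here, not something the dataset description states. A minimal sketch that checks it on three rows copied by hand from the output above (the names `sample`, `component_sum`, `gap`, and `components_match` are introduced only for this illustration):

```python
import pandas as pd

# Three rows copied by hand from the Out[3] display.
sample = pd.DataFrame({
    "asof_date": pd.to_datetime(["2001-05-31", "2001-06-30", "2001-07-31"]),
    "total_private": [111398.132821, 111167.552706, 110964.904589],
    "goods_producing": [24140.307995, 23992.631803, 23854.365510],
    "service_providing": [87257.824826, 87174.920903, 87110.539079],
})

# Assumption: the two sector columns should add up to the total.
sample["component_sum"] = sample["goods_producing"] + sample["service_providing"]
sample["gap"] = (sample["total_private"] - sample["component_sum"]).abs()

# The tolerance is arbitrary; the displayed values are rounded to six decimals.
components_match = bool((sample["gap"] < 1e-3).all())
print(sample[["asof_date", "total_private", "component_sum", "gap"]])
print("components match:", components_match)
```

On these rows the gap is effectively zero. The 2001-04-30 row differs by about 0.05, possibly because its total is shown as a round number.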

Blaze displays only the first few rows of the data. To confirm the full size, let's count the rows in the Blaze expression:


In [5]:
adp_empl_sec.count()


Out[5]:
176
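
As a rough consistency check, 176 monthly observations starting in April 2001 would end in November 2015, assuming exactly one row per month with no gaps (an assumption; the output above does not confirm it). A small sketch of that arithmetic using only the standard library, where `last_month` is a hypothetical helper written for this example:

```python
def last_month(start_year, start_month, n_rows):
    """Return (year, month) of the final row, assuming one row per month and no gaps."""
    # Count months from year 0 so that integer division recovers the year.
    index = start_year * 12 + (start_month - 1) + (n_rows - 1)
    return index // 12, index % 12 + 1


end_year, end_month = last_month(2001, 4, 176)
print(end_year, end_month)
```

If the latest asof_date in the data differs from this result, the series either has missing months or more than one row for some month.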

Let's plot it for fun. This data set is small enough to load straight into a Pandas DataFrame:


In [7]:
adp_df = odo(adp_empl_sec, pd.DataFrame)

adp_df.plot(x='asof_date', y='total_private')
plt.xlabel("As Of Date (asof_date)")
plt.ylabel("Employment Levels")
plt.title("ADP Employment Level Data")
plt.legend().set_visible(False)
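
A level series like this can be easier to read as month-over-month changes. A minimal sketch, using four rows copied from the table above in place of the full `adp_df` (the names `levels` and `monthly_change` are introduced here for illustration and are not part of the dataset):

```python
import pandas as pd

# Four rows copied by hand from the Out[3] display, standing in for adp_df.
levels = pd.DataFrame({
    "asof_date": pd.to_datetime(
        ["2001-04-30", "2001-05-31", "2001-06-30", "2001-07-31"]
    ),
    "total_private": [111573.000000, 111398.132821, 111167.552706, 110964.904589],
})

# Sort by date first: diff() subtracts the previous row, so order matters.
levels = levels.sort_values("asof_date").reset_index(drop=True)

# The first row has no predecessor, so its change is NaN.
levels["monthly_change"] = levels["total_private"].diff()
print(levels)
```

With the real data, the same `diff()` call on `adp_df['total_private']`, after sorting by `asof_date`, should give the full series of monthly changes.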


