Author: Bosco Rodríguez Ballvé
Date: Fall 2016
Class: Data Bootcamp @ NYU Stern
Instructors: Coleman, Lyon
A successful economy in the 21st century, in which the mix of products and services is changing constantly, requires a dynamic labor market as a mechanism to shift capital and labor. The ability of an economy to reallocate jobs across firms, industries, and geographical areas is, perhaps, even more important than capital.
Decades of persistently high unemployment in Spain, regardless of business cycle fluctuations, suggest that a lack of labor market dynamism has hindered Spain’s economy. Almost a decade after the Great Recession, with economic recovery underway, Spain’s unemployment rate remains stubbornly high, particularly among the young. Chronic unemployment suggests deep-rooted, structural causes that go beyond demand-deficient or cyclical unemployment.
The aim of this project is to compile and process data to shed light on the relationship between education levels, age and structural unemployment in Spain.
In [3]:
import sys
import datetime as dt
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import wbdata
from pandas_datareader import data, wb # we will be working with World Bank Data
%matplotlib inline
My first data source is the World Bank. Wbdata is a simple Python interface for finding and requesting information from the World Bank's various databases, either as a dictionary containing full metadata or as a pandas DataFrame. It wraps most of the World Bank API and adds some convenience functions for searching and retrieving information. Note that the search and download calls in the cells below go through the wb module of pandas_datareader, imported above.
Documentation is available at http://wbdata.readthedocs.org/
We install it with 'pip install wbdata'
Credits go to:
Sherouse, Oliver (2014). Wbdata. Arlington, VA. Available from http://github.com/OliverSherouse/wbdata.
Let's get to it.
In [4]:
wb.search('gdp.*capita.*const') # we use this function to search for GDP related indicators
Out[4]:
In [5]:
wb.search('employment') # we use this function to search for employment related indicators
Out[5]:
In [6]:
wb.search('unemployment') # we use this function to search for unemployment related indicators
Out[6]:
In [7]:
# I have identified the relevant variables in the three fields
# To download data for multiple indicators, I specify them as a list
# ESP is the ISO code for Spain
# I equalize the start and end dates
# Each code is listed once, so the seven indicators match the seven column names below
indicators = ['NY.GDP.PCAP.CD', 'SL.UEM.TOTL.ZS', 'SL.UEM.1524.ZS',
              'SL.UEM.PRIM.ZS', 'SL.UEM.SECO.ZS', 'SL.UEM.TERT.ZS',
              'SL.UEM.NEET.MA.ZS']
# Download once and construct the dataframe
data = wb.download(indicator=indicators, country=['ESP'], start=1990, end=2015)
esplbr = pd.DataFrame(data)
# Rename the columns for clarity (NY.GDP.PCAP.CD is reported in current US$)
esplbr.columns = ["GDP/capita(current US$)", "UnemploymentRate", "YouthUnempRate", "UnempW/PrimEd.", "UnempW/SecEd", "UnempW/TertEd", "Ni-nis"]
# What on earth are Ni-nis? A Spanish neologism for "ni estudia, ni trabaja": percentage of youth "not working, not studying"
# A cultural and socioeconomic phenomenon
# Note: SL.UEM.NEET.MA.ZS is the male NEET series in the World Bank naming scheme
esplbr
Out[7]:
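Assigning column names by position only works if the columns come back in exactly the order requested. A minimal sketch of a more defensive alternative is to rename through a code-to-label mapping. It assumes the downloaded frame is labelled with the indicator codes; the toy frame and its values below are made up and stand in for the real download.

```python
import pandas as pd

# Map World Bank indicator codes to the readable labels used in this notebook.
LABELS = {
    "SL.UEM.TOTL.ZS": "UnemploymentRate",
    "SL.UEM.1524.ZS": "YouthUnempRate",
}

# Hypothetical stand-in for the downloaded frame (values are invented).
idx = pd.MultiIndex.from_product(
    [["Spain"], ["2015", "2014"]], names=["country", "year"]
)
raw = pd.DataFrame(
    {"SL.UEM.1524.ZS": [48.0, 53.0], "SL.UEM.TOTL.ZS": [22.0, 24.0]},
    index=idx,
)

# rename() matches on the code, so the column order no longer matters.
renamed = raw.rename(columns=LABELS)
print(renamed.columns.tolist())
```

Codes missing from the mapping are left unchanged, which makes an unexpected column easy to spot.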
In [8]:
# wb.download returns a complex (country, year) multi-index, which I convert to old-school columns that are easier to work with
esplbr.reset_index(inplace=True)
esplbr
Out[8]:
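To see what reset_index does to a two-level index, here is a small self-contained sketch. The frame is a toy with invented values and assumed level names (country, year); it only mirrors the shape of the data above.

```python
import pandas as pd

# Toy frame with a (country, year) MultiIndex; the numbers are invented.
idx = pd.MultiIndex.from_product(
    [["Spain"], ["2015", "2014", "2013"]], names=["country", "year"]
)
toy = pd.DataFrame({"UnemploymentRate": [22.0, 24.0, 26.0]}, index=idx)

# reset_index() moves both index levels into ordinary columns and
# replaces the index with a plain 0..n-1 range.
flat = toy.reset_index()
print(flat.columns.tolist())
print(flat.index.tolist())
```

The index levels become the leftmost columns, which is why 'Country' and 'Year' are listed first in the renaming step that follows.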
In [9]:
esplbr.columns
Out[9]:
In [10]:
# housekeeping for column names
esplbr.columns = ["Country", "Year", "GDP/capita(current US$)", "UnemploymentRate", "YouthUnempRate", "UnempW/PrimEd.", "UnempW/SecEd","UnempW/TertEd", "Ni-nis"]
esplbr
Out[10]:
In [11]:
# we know we are dealing exclusively with Spain, so we drop the redundant 'Country' column
esplbr.drop('Country', axis=1, inplace=True)
esplbr
Out[11]:
In [12]:
# what do I have in my hands?
esplbr.dtypes
Out[12]:
In [13]:
esplbr.index
Out[13]:
In [14]:
# with a clean and orthodox Dataframe, I can start to do some graphics
import matplotlib.pyplot as plt
%matplotlib inline
# we invert the x axis. Never managed to make 'Year' the X axis, lost a lot of hair in the process :(
plt.gca().invert_xaxis() # Came up with this solution
# and add the indicators
plt.plot(esplbr.index, esplbr['UnemploymentRate'])
plt.plot(esplbr.index, esplbr['YouthUnempRate'])
plt.plot(esplbr.index, esplbr['Ni-nis'])
# and modify the plot
plt.title('Labor Market in Spain', fontsize=14, loc='left') # add title
plt.ylabel('Percentage Unemployed') # y axis label
plt.legend(['UnemploymentRate', 'YouthUnempRate','Ni-nis'], fontsize=8, loc=0)
Out[14]:
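One possible way to put 'Year' on the x axis, sketched on a toy frame with invented values: convert the year labels to integers, make them the index, and sort ascending, so no axis inversion is needed. This assumes the years arrive as strings with the newest first, as they appear to in the table above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data shaped like esplbr after reset_index(); values are invented.
toy = pd.DataFrame({
    "Year": ["2015", "2014", "2013"],
    "UnemploymentRate": [22.0, 24.0, 26.0],
    "YouthUnempRate": [48.0, 53.0, 55.0],
})

# Integer years as a sorted index give a chronological x axis.
by_year = (
    toy.assign(Year=toy["Year"].astype(int))
       .set_index("Year")
       .sort_index()
)

fig, ax = plt.subplots()
ax.plot(by_year.index, by_year["UnemploymentRate"], label="UnemploymentRate")
ax.plot(by_year.index, by_year["YouthUnempRate"], label="YouthUnempRate")
ax.set_xlabel("Year")
ax.set_ylabel("Percentage Unemployed")
ax.legend(fontsize=8, loc=0)
```

With the years in the index, a lookup such as by_year.loc[2014] also reads more naturally than a row position.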
Observations
Spain has recently lived through a depression without precedent, yet unemployment rates above 20% are nothing new: there is a large structural component in addition to the demand-deficient factor.
Youth unemployment is particularly bad. A gap between youth and overall unemployment is the norm elsewhere too, but the spread is accentuated in Spain. This hints at labor market duality between bullet-proof permanent contracts and temporary or part-time contracts.
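The spread between youth and overall unemployment can be expressed either as a gap in percentage points or as a ratio. A minimal sketch, using invented numbers rather than the downloaded series:

```python
import pandas as pd

# Invented rates for two years, only to illustrate the arithmetic.
toy = pd.DataFrame(
    {"UnemploymentRate": [20.0, 25.0], "YouthUnempRate": [45.0, 55.0]},
    index=pd.Index([2011, 2013], name="Year"),
)

# Gap in percentage points and ratio of youth to overall unemployment.
toy["YouthGap"] = toy["YouthUnempRate"] - toy["UnemploymentRate"]
toy["YouthRatio"] = toy["YouthUnempRate"] / toy["UnemploymentRate"]
print(toy)
```

The same two lines applied to esplbr would give the actual spread for Spain. Comparing it with other countries would need their series as well.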
In [16]:
# let's take a look at unemployment by education level
import matplotlib.pyplot as plt
%matplotlib inline
# we invert the x axis
plt.gca().invert_xaxis()
#we add the variables
plt.plot(esplbr.index, esplbr['UnempW/PrimEd.'])
plt.plot(esplbr.index, esplbr['UnempW/SecEd'])
plt.plot(esplbr.index, esplbr['UnempW/TertEd'])
plt.plot(esplbr.index, esplbr['Ni-nis'])
# we modify the plot
plt.title('Education and Employment Outcomes', fontsize=14, loc='left')
plt.ylabel('Percentage Unemployed')
plt.legend(['UnempW/PrimEd.', 'UnempW/SecEd','UnempW/TertEd', 'Ni-nis'], fontsize=7, loc=0)
Out[16]:
Observations
P.S.: if you ever need to investigate (how not to execute) a Keynesian stimulus plan, check out how the government's Plan E added fuel to malinvestments http://www.economist.com/node/13611650
I'm interested in measuring structural unemployment. Ideally, I would build an unemployment model myself based on separation and accession rates to arrive at the Natural Rate of Unemployment, as we see in one of my three bibles:
In the interest of time, I sought an indicator that acts as a proxy for structural unemployment. The NAIRU and NAWRU come to mind, but they are not reported by the World Bank.
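For reference, the flow approach mentioned above can be sketched with the textbook steady-state condition. If a fraction s of the employed lose their jobs each period and a fraction f of the unemployed find one, flows balance when s(1 - u) = f·u, which gives u* = s / (s + f). The function name and the rates below are hypothetical, not estimates for Spain.

```python
def natural_rate(separation_rate, finding_rate):
    """Steady-state unemployment rate u* = s / (s + f).

    separation_rate: share of the employed who lose a job each period (s)
    finding_rate: share of the unemployed who find a job each period (f)
    """
    if separation_rate < 0 or finding_rate < 0:
        raise ValueError("rates must be non-negative")
    if separation_rate + finding_rate == 0:
        raise ValueError("at least one rate must be positive")
    return separation_rate / (separation_rate + finding_rate)


# Hypothetical monthly rates: 2% separation, 18% job finding.
print(round(natural_rate(0.02, 0.18), 4))
```

A lower job-finding rate raises u* even when separations are unchanged, which is one way a rigid labor market can show up as structural unemployment.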
And so I became acquainted with Quandl's API and proceeded to dig through several economic databases, and landed at the notorious OECD database: I suspect Quandl and I are going to become good friends moving forward.
In [17]:
# Don't forget the DMV paperwork
import os # to read the API key from the environment
import quandl # Quandl package
# Register for a key and an unlimited number of requests.
# The key is read from an environment variable (QUANDL_API_KEY is an assumed name) rather than hard-coded in the notebook
quandl.ApiConfig.api_key = os.environ.get('QUANDL_API_KEY')
# Playing it safe
import sys # system module
import pandas as pd # data package
import matplotlib.pyplot as plt # graphics module
import datetime as dt # date and time module
import numpy as np
%matplotlib inline
We're going to be comparing Spain's NAIRU to that of Denmark. Don't tell Sanders, but Denmark is well known for having one of the most 'flexible' labor markets in Europe.
In [18]:
# We extract the indicators and print the dataframe
NAIRU = quandl.get(['OECD/EO91_INTERNET_ESP_NAIRU_A', 'OECD/EO91_INTERNET_DNK_NAIRU_A'], # We call for both
                   start_date="1990-12-31", end_date="2013-12-31") # And limit the time horizon
NAIRU
Out[18]:
In [19]:
# What do we have here?
type(NAIRU)
Out[19]:
In [20]:
NAIRU.columns
Out[20]:
In [21]:
# Dataframe housekeeping
NAIRU.columns = ['NAIRU Spain', 'NAIRU Denmark']
NAIRU
Out[21]:
In [22]:
# Nice and polished
NAIRU.columns
Out[22]:
In [23]:
plt.style.available #Take a look at the menu
Out[23]:
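A style only affects figures created after it is selected. A minimal sketch of one way to scope a style to a single figure is the plt.style.context manager; the series below is invented and only gives the plot something to draw.

```python
import matplotlib.pyplot as plt

# Invented numbers, only to have something to draw.
years = [2011, 2012, 2013]
values = [15.0, 16.0, 17.0]

# The style applies to figures created inside the with-block and is
# reverted afterwards, so other plots in the notebook are unaffected.
with plt.style.context("bmh"):
    fig, ax = plt.subplots()
    ax.plot(years, values, label="toy series")
    ax.set_ylabel("Percentage Unemployed")
    ax.legend(fontsize=8, loc=2)
```

Calling plt.style.use instead changes the style for every figure drawn afterwards in the session.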
In [24]:
# We are ready to plot
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use("bmh") # choose the style before plotting so that it applies to this figure
#we add the variables
plt.plot(NAIRU.index, NAIRU['NAIRU Spain'])
plt.plot(NAIRU.index, NAIRU['NAIRU Denmark'])
#We modify the plot
plt.title('Measuring Structural Unemployment ESP v DEN', fontsize=15, loc='left') # add title
plt.ylabel('Percentage Unemployed') # y axis label
plt.legend(['NAIRU Spain', 'NAIRU Denmark'], fontsize=8, loc=2) # more descriptive variable names; loc=2 places the legend in the upper left
Observations
This project has left me at the doors of great questions,
that this course has given me the tools to answer,
and for that I thank you,
Bosco Rodríguez Ballvé