This is a jupyter notebook that can be executed with an R kernel running in the background to execute R code.
The content below draws heavily upon these two excellent souces:
Cattaneo, Idrobo and Titiunik (2008) "A Practical Introduction to Regression Discontinuity Designs: Part I," in Cambridge Elements: Quantitative and Computational Methods for Social Science, Cambridge University Press. See also "Part II" paper.
Meyersson (2014): Islamic Rule and the Empowerment of the Poor and Pious, Econometrica 82(1): 229-269.
Links to these papers and data and both Stata and R code for replication at RD Software Packages site.
Assignment to treatment status is a deterministic and discontinuous function of a covariate (running variable or forcing variable) $X_i$.
$$ D_i= \left\{ \begin{array}{@{}ll@{}} 1 & \text{if}\ X_i\ge X_0 \\ 0 & \text{if}\ X_i \lt X_0 \end{array} \right. $$where $X_0$ is a threshold or cutoff.
Example: National Merit scholarship awarded to all students with PSAT score above some threshold.
$Y_i(1)$ and $Y_i(0)$ are potential outcomes observed under treatment or control, respectively. $E[Y_i(1)|X_i]$ observed only to right of cutoff. $E[Y_i(0)|X_i]$ only to left.
Treatment effect estimated as
$$ \tau_{SRD}=E[Y_i(1)-Y_i(0)|X_i=X_0] $$Local treatment effect at/near the cutoff. In practice SRD estimate estimated as difference between weighted average of outcomes on the either side of the discontinuity.
Conditional independence:
$$E[Y_i(0) |X_i , D_i ] = E[Y_i(0) |X_i ]$$once we control for cofounder $X_i$, treatment assignment as good as random.
For observations very close to the discontinuity we effectively have an experiment.
Non-linearity must not be mistaken for a discontinuity. In this example with linear fits on each side, estimate a positive 'treatment' effect. More likely relationship is non-linear with zero treatment effect.
In [41]:
#control the size of R plots in jupyter
options(repr.plot.width=5, repr.plot.height=5)
In [42]:
# Load R libraries
library(foreign)
library(ggplot2)
library(lpdensity)
library(rddensity)
library(rdrobust)
library(rdlocrand)
library(TeachingDemos)
In [43]:
data = read.dta("CIT_2018_Cambridge_polecon.dta")
In [44]:
Y = data$Y
X = data$X
T = data$T
T_X = T*X
Raw Comparison of means (Figure 2.3a)
If we simply compare outcomes in treated (Islamic major) and non-treated areas (secular major), we find lower female educational attainment in areas where Islamic parties won.
In [45]:
rdplot(Y, X, nbins = c(2500, 500), p = 0, col.lines = "red", col.dots = "lightgray", title = "",
x.label = "Islamic Margin of Victory", y.label = "Female High School Share", y.lim = c(0,70))
In last plot we compared mean value in treated and control group. But these two groups differ considerably.
Below we focus on more 'local' effects (closer to just above and below the cutoff).
In next plot we fit a 4th degree polynomial on each side but now limit bandwidth to 'closer' contests where absolute victory margin was within 50 points.
In [46]:
rdplot(Y[abs(X) <= 50], X[abs(X) <= 50], nbins = c(2500, 500), p = 4, col.lines = "red",
col.dots = "lightgray", title = "", x.label = "Islamic Margin of Victory",
y.label = "Female High School Share", y.lim = c(0,70))
In [47]:
rdplot(Y[abs(X) <= 10], X[abs(X) <= 10], nbins = c(2500, 500), p = 4, col.lines = "red", col.dots = "lightgray", title = "",
x.label = "Islamic Margin of Victory", y.label = "Female High School Share", y.lim = c(0,70))
In [48]:
rdplot(Y, X, nbins = c(20,20), binselect = 'esmv', x.label = 'Running Variable', y.label = 'Outcome',
title = '', y.lim = c(0,25))
The estimating equation:
$$ Y_i(0) = f(X_i) + \epsilon _i $$$$Y_i(1) = Y_i(0) + \beta$$All in one: $$ Y_i = f(X_i) +\beta D_i + \epsilon_i $$
The function $f(X_i)$ (e.g. a polynomial) must be continuous at $X_0$
The RD estimate is difference between weighted average of outcomes on the either side of the discontinuity. Fitting a high order polynomial can mean these weighted averages are driven by observations far from the threshold.
Parametric (polynomial on full data) or non-parametric (local regression closer to cutoff) approaches?
Andrew Gelman's blog post from 8/13 discusses this issue. See also Development Impact Blog
Chen et al (2013) "Evidence on the impact of sustained exposure to air pollution on life expectancy from China’s Huai River policy,” PNAS.
Policy discontinuity: North of China's Huai river free coal for heating is distributed in winter. None ti the south. Using RDD method find total suspended particulates (TSPs) air pollution 55% higher just North of river compared to just south. Estimates China’s coal-burning was reducing lifespan by 5 years for half a billion people.
Are observations far from threshold affecting polynomial fit, driving results? Smaller estimates with linear fits?
A fuzzy RD is like an RD with imperfect compliance or non-compliance. Assignment to Treatment is as before summarized by indicator $I({X_i \ge X_0})$ but compliance is now imperfect. Like an Intent-to-Treat (ITT) estimator, the measured jump at the cutoff in the outcome needs to be rescaled.
Structural equation: $Y_i = f(X_i)+\beta D_i +\epsilon_i$
2nd stage reduced form: $Y_i = f(X_i)+\pi_2 I({X_i \ge X_0} +\xi_{2i}$
First stage: $D_i = g(X_i)+\pi_1 I({X_i \ge X_0} +\xi_{1i}$
Standard IV: $ \beta=\frac{\pi_2}{\pi_1} $
Example: Third graders have to go to summer school if score below $X_{10}$ in reading test OR below $X_{20}$ in math test. Does summer school raise test performance next year?
With two forcing variables