Introduction to Structural Estimation

... background material available at https://github.com/softecon/talks

Usefulness of Structural Econometric Models

  • Verification How accurately does the computational model solve the underlying equations for the quantities of interest?

  • Validation How accurately does the model represent reality for the quantities of interest?

  • Uncertainty Quantification How do the various sources of error and uncertainty feed into uncertainty in the model-based prediction of the quantities of interest?

Goals of Structural Estimation

  • identify deep parameters and mechanisms governing economic behavior

  • provide counterfactual policy predictions

  • determine optimal policies

  • integrate separate results into a unified framework

Deep Parameters

  • Individual's Utility Function

    • risk preference
    • desire for probabilistic protection
    • elasticity of (intertemporal) substitution
    • discount factor
  • Firm's Production Function

    • elasticity of (input) substitution
  • Economic Environment

    • level of uncertainty

$\rightarrow$ invariant to changes in policy.

Alternative Estimation Strategies

  • Traditional

    • Maximum Likelihood
    • Method of Moments
    • Generalized Method of Moments
  • Simulation-Based

    • Simulated Maximum Likelihood
    • Simulated Method of Moments
    • Indirect Inference

Toy Example

$$ \begin{align*} V^* &= \theta + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, 1)\\ D &= \begin{cases} 1 & V^* > 0 \\ 0 & \text{otherwise} \end{cases}\\ \Pr[D = 1 \mid \theta] &= P(\theta) = \Phi(\theta) \end{align*} $$

Maximum Likelihood Estimation

We attempt to find the parameter values that most likely produced the observed data.

$$ \begin{align*} \hat{\theta}^{ML} &= \arg \max_{\theta} \log \mathcal{L}(\theta; D) \end{align*} $$

In our case, the criterion function is straightforward to derive.

$$ \begin{align*} \log \mathcal{L}(\theta; D) &= \sum^N_{i = 1} \left[ D_i \log P(\theta) + (1 - D_i) \log(1 - P(\theta)) \right] \end{align*} $$
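
In this example, the first-order condition admits a closed-form solution: the score is zero when the predicted choice probability matches the sample frequency.

$$ \begin{align*} \Phi(\hat{\theta}^{ML}) = \frac{1}{N} \sum^N_{i = 1} D_i \quad \Longrightarrow \quad \hat{\theta}^{ML} = \Phi^{-1}\left(\frac{1}{N} \sum^N_{i = 1} D_i\right) \end{align*} $$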

Method of Moments

We attempt to minimize the discrepancy between observed and theoretical moments.

$$ \begin{align*} \hat{\theta}^{MM} &= \arg \min_{\theta} (\mu(\theta) - m(D))^2 \end{align*} $$

For example, we can use the first moments.

$$ \begin{align*} \mu(\theta) &= E[D] = P(\theta) = \Phi(\theta) \\ m(D) &= \frac{1}{N} \sum^N_{i = 1} D_i \end{align*} $$
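
Equating $\mu(\theta)$ and $m(D)$ drives the squared discrepancy to zero, so with this single moment the method of moments estimator coincides with the maximum likelihood estimator in this example: $\hat{\theta}^{MM} = \Phi^{-1}\left(\frac{1}{N}\sum^N_{i = 1} D_i\right)$.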

Generalized Method of Moments

We attempt to minimize a weighted discrepancy between the observed and theoretical moments.

$$ \begin{align*} \hat{\theta}^{GMM} &= \arg \min_{\theta} (\mu(\theta) - m(D))' W (\mu(\theta) - m(D)) \end{align*} $$

  • Which and how many moments to choose?

  • How to choose the weighting matrix?
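
A minimal sketch for the toy example, assuming the mean and the variance of $D$ as moments with a generic weighting matrix; the function criterion_gmm and this moment choice are illustrative, not from the original slides.

In [ ]:
import numpy as np
from scipy.stats import norm

def criterion_gmm(theta, data_obs, weights):
    """ Illustrative GMM criterion: weighted squared distance between
    the theoretical and the observed mean and variance of D.
    """
    p = norm.cdf(theta)
    moments_theory = np.array([p, p * (1.0 - p)])
    moments_obs = np.array([np.mean(data_obs), np.var(data_obs)])
    diff = moments_theory - moments_obs
    return diff @ weights @ diff

With weights = np.identity(2), both moments receive equal weight. For a Bernoulli outcome the variance is determined by the mean, so the second moment is redundant here and serves only to illustrate the mechanics.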

Simulation-Based Estimation

$$ \begin{align*} P(\theta) = \int^{\theta}_{-\infty} \phi(\epsilon)\, d\epsilon = \int^{\infty}_{-\infty} I[\theta + \epsilon > 0]\, \phi(\epsilon)\, d\epsilon \end{align*} $$

We could evaluate the choice probability by simply drawing a set of $R$ standard normal deviates and applying the indicator function to each one.

$$ \begin{align*} \hat{P}(\theta) = \frac{1}{R} \sum^R_{r=1} I[\theta + \epsilon^r > 0] \end{align*} $$
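
The cells below rely on a helper simulate_sample that is not shown in this section. A minimal sketch consistent with its usage, assuming it returns the simulated choices together with the latent utilities, along with the imports the cells require:

In [ ]:
import numpy as np
from scipy.stats import norm

def simulate_sample(theta, num_sim):
    """ Minimal sketch of the simulation helper. Returns the simulated
    binary choices and the latent utilities; the second return value is
    an assumption based on how the helper is used below.
    """
    # Draw num_sim standard normal deviates and construct latent utility.
    u = theta + np.random.normal(size=num_sim)
    # Apply the indicator function to each draw.
    data_sim = (u > 0).astype(float)
    return data_sim, u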

In [1]:
def criterion_ml(theta, data_obs, type_, num_sim):
    """ This function evaluates the criterion for traditional and
    simulation-based maximum likelihood estimation.
    """
    if type_ == 'traditional':
        # Exact choice probability from the normal CDF.
        probs = norm.cdf(theta)
    elif type_ == 'AR-simulator':
        # Accept-reject frequency simulator based on num_sim draws.
        data_sim, _ = simulate_sample(theta, num_sim)
        probs = np.mean(data_sim)
    else:
        raise AssertionError('Invalid request')

    # Individual contributions to the log-likelihood. The AR-simulator
    # can yield probabilities of exactly zero or one, which motivates
    # the smoothed simulator introduced below.
    rslt = data_obs * np.log(probs) + (1.0 - data_obs) * np.log(1.0 - probs)

    return np.sum(rslt)
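
A hypothetical usage sketch, reusing simulate_sample from above with an arbitrary true parameter of 0.5:

In [ ]:
from scipy.optimize import minimize_scalar

# Recover theta by minimizing the negative log-likelihood.
np.random.seed(123)
data_obs, _ = simulate_sample(0.5, 10000)
rslt = minimize_scalar(
    lambda theta: -criterion_ml(theta, data_obs, 'traditional', None),
    bounds=(-2.0, 2.0), method='bounded')
print(rslt.x)  # approximately Phi^{-1}(mean(data_obs))
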
Chatter in Criterion Function

Because the accept-reject simulator assigns each draw to zero or one, the simulated criterion function is not smooth in $\theta$ and exhibits chatter; increasing the number of draws $R$ dampens the problem but does not remove it.

[Figures: maximum-likelihood estimation (R = 100), method of moments estimation (R = 100), maximum-likelihood estimation (R = 10,000).]

Logit-Smoothed Accept-Reject Simulator

$$ \begin{align*} \hat{P}(\theta) = \frac{1}{R} \sum^R_{r=1} \frac{\exp\{(\theta + \epsilon^r) \lambda^{-1}\}}{1 + \exp\{(\theta + \epsilon^r) \lambda^{-1}\}} \end{align*} $$

  • simulated probabilities are smooth in the parameters

  • no zero-probability events

[Figure: the logistic function.]
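
The cell below calls a helper get_smoothed_probabilities that is not shown. A minimal sketch consistent with its usage, assuming it maps the latent utilities into a two-column array with the smoothed probabilities of $D = 0$ and $D = 1$:

In [ ]:
def get_smoothed_probabilities(u, lambda_):
    """ Minimal sketch of the smoothing helper. Column 0 holds the
    smoothed probability of D = 0, column 1 that of D = 1; this layout
    is an assumption based on how the helper is used below.
    """
    probs_one = 1.0 / (1.0 + np.exp(-u / lambda_))  # logistic kernel
    return np.column_stack((1.0 - probs_one, probs_one))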

In [ ]:
def criterion_mm(theta, data_obs, type_, num_sim, lambda_):
    """ This function evaluates the criterion for traditional and
    simulation-based method of moments estimation.
    """
    if type_ == 'traditional':
        # Exact choice probability from the normal CDF.
        stat = norm.cdf(theta)[0]
    elif type_ == 'AR-simulator':
        # Crude accept-reject frequency simulator.
        data_sim, _ = simulate_sample(theta, num_sim)
        stat = np.mean(data_sim)
    elif type_ == 'smoothed-AR-simulator':
        # Logit-smoothed accept-reject simulator.
        _, u = simulate_sample(theta, num_sim)
        stat = np.mean(get_smoothed_probabilities(u, lambda_), axis=0)[1]
    else:
        raise AssertionError('Invalid request')

    # Squared distance between the observed and theoretical first moment.
    rslt = (np.mean(data_obs) - stat) ** 2

    return rslt
[Figure: method of moments estimation with smoothing.]

Indirect Inference

The basic idea is to describe the data with an auxiliary model and to use its estimated parameters as the empirical moments. The goal is then to minimize the discrepancy between the parameters estimated on the simulated and on the observed data. Alternative distance metrics exist, and often a weighting matrix is used.

$$ \begin{align*} \hat{\theta}^{II} = \arg\min_{\theta} \left(\hat{\beta}^{OBS} - \hat{\beta}^{II}(\theta)\right)^2 \end{align*} $$

In [ ]:
import statsmodels.api as sm

def criterion_ii(theta, data_obs, num_sim):
    """ This function evaluates the criterion for indirect inference
    estimation.
    """
    # Run the auxiliary model on the observed data.
    beta_obs = sm.OLS(data_obs, np.tile(1, len(data_obs))).fit().params[0]

    # Simulate a dataset with the candidate parametrization and run the
    # auxiliary model again.
    data_sim, _ = simulate_sample(theta, num_sim)
    beta_ii = sm.OLS(data_sim, np.tile(1, len(data_sim))).fit().params[0]

    return (beta_obs - beta_ii) ** 2
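
Because the auxiliary model regresses the choices on a constant only, $\hat{\beta}$ is just the sample mean of $D$, so indirect inference reduces to the simulated method of moments in this toy example. A hypothetical usage sketch, reusing simulate_sample from above:

In [ ]:
# Grid search over candidate parameter values.
np.random.seed(123)
data_obs, _ = simulate_sample(0.5, 10000)
grid = np.linspace(0.0, 1.0, 101)
crit = [criterion_ii(theta, data_obs, 10000) for theta in grid]
print(grid[np.argmin(crit)])  # close to the true theta of 0.5
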
[Figure: indirect inference estimation.]

Research Examples

Eisenhauer, P., Heckman, J. and Mosso, S. (2015). The Estimation of Dynamic Discrete Choice Models by Maximum Likelihood and the Simulated Method of Moments. International Economic Review, 56(2):331-357.

  • Comparison of Traditional and Simulation-Based Estimation

  • Choice of Weighting Matrix

  • Choice of Optimization Algorithm

[Figures: choice of weighting matrix; choice of optimization algorithm.]

Eisenhauer, P. (2016). The Approximate Solution of Finite-Horizon Discrete Choice Dynamic Programming Models: Revisiting Keane & Wolpin (1994), available at https://github.com/structRecomputation

  • Successful Recomputation

  • Additional Quality Diagnostics of $E\max$ Approximation

Interpolation Function

$$\begin{align*} E \max - \max E = \pi_0 + \sum^4_{j = 1} \pi_{1j} \left(\max E - \bar{V}_j\right) + \sum^4_{j = 1} \pi_{2j} \left(\max E - \bar{V}_j\right)^{\tfrac{1}{2}} \end{align*}$$
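
A minimal sketch of the interpolation step, with hypothetical array names: the regression is fit on the subset of states where $E\max$ is computed exactly and can then predict the remaining states. This is a sketch of the technique, not the implementation in the paper.

In [ ]:
import numpy as np
import statsmodels.api as sm

def fit_interpolation(emax, max_e, v_bar):
    """ Fit the interpolation regression above. The (hypothetical)
    inputs are emax and max_e with shape (num_states,) and v_bar with
    shape (num_states, 4), where max_e is the maximum of the expected
    alternative-specific values, so max_e - v_bar is nonnegative.
    """
    deviations = max_e[:, np.newaxis] - v_bar
    exog = np.column_stack(
        (np.ones(len(emax)), deviations, np.sqrt(deviations)))
    return sm.OLS(emax - max_e, exog).fit()
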
[Figures: approximation scheme and simulation; approximation scheme and estimation.]

Practical Estimation Exercise

Transparency, Recomputability, and Extensibility

Curse of Dimensionality

Typical Workflow on Compute Machine

  • respy-compare, compare the observed and simulated economies

  • respy-estimate, start an estimation run

    • just-in-time monitoring of progress and quality
  • respy-update, update the initialization files with estimation results

  • respy-modify, modify the parametrization in the initialization file

Contact



Philipp Eisenhauer



Software Engineering for Economists Initiative