Dynamic Discrete Choice Models

In a dynamic discrete choice (DDC) model an agent chooses a series of actions according to its preferences, cognitive biases and beliefs about the future.

A common assumption in DDC models is the random utility assumption. Under the random utility assumption, an agent chooses the action that maximizes their expected present discounted utility (in other words, they have rational expectations about the future and their choices are free from cognitive bias). This means that the agent being analyzed by the researcher behaves according to a Markov Decision Process.

In a Markov Decision Process (MDP), an agent picks a policy (a function mapping states to actions) that maximizes their expected present discounted utility. An MDP is a tuple, $(S, A, T, U, \beta)$ (a minimal code sketch follows the list), where

  • $S$ is the state space
  • $A$ is the action space
  • $T$ is a transition distribution (a function that takes a state and an action and returns a probability distribution over states for the next period)
  • $U$ is a utility function that maps state-action pairs to utils
  • $\beta$ is the discount factor
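
To make the objects in this tuple concrete, here is a minimal sketch of an MDP as a data structure. It is written in Python with hypothetical field names, assumes finite state and action spaces, and is not a prescription for how the library should represent an MDP.

```python
from dataclasses import dataclass
from typing import Sequence

import numpy as np


@dataclass
class MDP:
    """A finite Markov Decision Process (S, A, T, U, beta)."""

    states: Sequence        # S: the state space
    actions: Sequence       # A: the action space
    transition: np.ndarray  # T: transition[s, a] is a probability distribution over next states
    utility: np.ndarray     # U: utility[s, a] is the per-period payoff in utils
    beta: float             # discount factor, 0 <= beta < 1


# A policy is then just a map from states to actions, e.g. a vector with
# policy[s] = a; the agent picks the policy that maximizes expected
# discounted utility.
```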

DDC models typically partition the state variables into two groups: those observed by the researcher and those that are not.

  • $x$: tuple of agent state variables observed by the researcher
  • $\epsilon$: tuple of agent state variables unobserved by the researcher

DDC models also make a number of assumptions about the structure of the MDP in order to keep estimation tractable. Some common assumptions are listed below, followed by a small numerical illustration:

  • Additive separability: The utility function is given by $\mu(s, a) = u(x, a) + \epsilon(a)$, where $\epsilon(a)$ is the component of $\epsilon$ associated with action $a$
  • Conditional independence: The transition function can be decomposed into $T(s' | s, a) = P_x(x' | x, a)P_\epsilon(\epsilon' | \epsilon, a)$
  • IID unobserved state variable: $\epsilon$ is IID across periods and across its components
  • Conditional logit: each component of $\epsilon$ is Gumbel (Type-I extreme value) distributed
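
To illustrate how these assumptions fit together, the following sketch (hypothetical Python, not part of the proposed interface) draws IID Gumbel shocks, adds them to flow utilities as in the additive separability assumption, and notes the multinomial logit choice probabilities they imply. The utility values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Flow utilities u(x, a) at one observed state x, for three actions.
u_x = np.array([1.0, 0.5, -0.2])

# IID unobserved state: one Gumbel (Type-I extreme value) draw per action.
eps = rng.gumbel(loc=0.0, scale=1.0, size=u_x.shape)

# Additive separability: total flow utility is u(x, a) + eps(a).
# (The continuation value is ignored here purely for illustration.)
chosen_action = int(np.argmax(u_x + eps))

# With Gumbel shocks, the implied choice probabilities are multinomial logit.
choice_probs = np.exp(u_x) / np.exp(u_x).sum()
```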

One of the first methods for estimating the parameters of the utility function in a DDC model from a series of state and action observations is the nested fixed point (NFXP) method.

NFXP

The NFXP method uses the additive separability (AS), conditional independence (CI), IID unobserved state variable, and conditional logit assumptions to greatly simplify estimation. An outline of how these assumptions simplify the value function is shown below.

$$
\begin{aligned}
V(x,\epsilon)&=\max_{d\in D}\mu(x,d,\epsilon)+\beta E_{x',\epsilon'}[V(x',\epsilon')\mid x,\epsilon,d]\\
&=\max_{d\in D}u(x,d)+\epsilon(d)+\beta E_{x',\epsilon'}[V(x',\epsilon')\mid x,\epsilon,d]\quad\text{AS}\\
&=\max_{d\in D}u(x,d)+\epsilon(d)+\beta\int_{x'}\int_{\epsilon'}V(x',\epsilon')\,p(x',\epsilon'\mid x,\epsilon,d)\,d\epsilon'\,dx'\\
&=\max_{d\in D}u(x,d)+\epsilon(d)+\beta\int_{x'}\int_{\epsilon'}V(x',\epsilon')\,p(\epsilon'\mid\epsilon,d)\,q(x'\mid x,d)\,d\epsilon'\,dx'\quad\text{CI}\\
&=\max_{d\in D}u(x,d)+\epsilon(d)+\beta\int_{x'}\left(\int_{\epsilon'}V(x',\epsilon')\,p(\epsilon'_{1})\cdots p(\epsilon'_{|\epsilon'|})\,d\epsilon'\right)q(x'\mid x,d)\,dx'\quad\text{IID}\\
&=\max_{d\in D}u(x,d)+\epsilon(d)+\beta\int_{x'}\overline{V(x')}\,q(x'\mid x,d)\,dx'
\end{aligned}
$$

where $\overline{V(x')}=E_{\epsilon'}[V(x',\epsilon')]$ is the value function integrated over the unobserved state variables.

When the observed state variables $x$ are discrete, the integral over $x'$ is replaced with a summation, and under the Gumbel assumption the expectation over $\epsilon$ has a closed log-sum-exp form:

$$ \begin{aligned} \overline{V(x)}&= \int_{\epsilon}\left(\max_{d\in D} u(x,d) + \epsilon(d) + \beta\sum_{x'}\overline{V(x')} q(x' | x, d)\right)dP(\epsilon) \notag\\ &=\log\left[\sum_{d=0}^{D}\exp\left\{u(x,d)+\beta\sum_{x'}\overline{V(x')}q(x'|x,d)\right\} \right]\quad\text{Gumbel Distribution} \end{aligned} $$
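
For a finite observed state space this log-sum-exp equation defines $\overline{V(x)}$ as the fixed point of a contraction, which can be computed by successive approximation. The sketch below is one way the NFXP inner loop could look in Python/NumPy; the array shapes and the function name `integrated_value` are assumptions for the example, not part of the proposed interface.

```python
import numpy as np
from scipy.special import logsumexp


def integrated_value(u, q, beta, tol=1e-10, max_iter=10_000):
    """Iterate V_bar(x) = log sum_d exp{u(x, d) + beta * sum_x' V_bar(x') q(x'|x, d)}.

    u : (n_states, n_actions) flow utilities u(x, d)
    q : (n_actions, n_states, n_states) transition probabilities q(x'|x, d)
    """
    n_states, n_actions = u.shape
    v_bar = np.zeros(n_states)
    for _ in range(max_iter):
        # Choice-specific values v(d, x) = u(x, d) + beta * E[V_bar(x') | x, d].
        v = u + beta * np.stack([q[d] @ v_bar for d in range(n_actions)], axis=1)
        v_bar_new = logsumexp(v, axis=1)  # closed form under the Gumbel assumption
        if np.max(np.abs(v_bar_new - v_bar)) < tol:
            return v_bar_new
        v_bar = v_bar_new
    return v_bar
```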

The probability of choosing an action given the states observed by the researcher then takes the familiar multinomial logit form, where $v(d,x)=u(x,d)+\beta\sum_{x'}\overline{V(x')}q(x'\mid x,d)$ is the choice-specific value function:

$$ P(d\mid x_{t},\theta)=\frac{\exp\{v(d,x_{t})\}}{\sum_{j=0}^{J}\exp\{v(j,x_{t})\}} $$

This bears a strong resemblance to the choice probability in multinomial logistic regression. The likelihood for agent $i$ is simply the probability of the first observed state times the product of the choice probabilities and transition probabilities over the whole series. Taking logs gives

$$ l_{i}(\theta)=\sum_{t=1}^{T}\log P(d=d_{it}|x_{it},\theta)+\left(\sum_{t=1}^{T-1}\log q(x_{it+1}|x_{it},d_{it},\theta)\right)+\log Pr(x_{i1}|\theta) $$
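
Given the integrated value function from the inner loop, one agent's log-likelihood contribution can be evaluated directly. The sketch below reuses the array shapes assumed above and omits the $\log Pr(x_{i1}\mid\theta)$ term for brevity; it is an illustration, not the proposed interface.

```python
import numpy as np
from scipy.special import logsumexp


def agent_loglik(x_obs, d_obs, u, q, beta, v_bar):
    """Log-likelihood of one agent's observed (x_t, d_t) series.

    x_obs, d_obs : integer arrays of length T (observed state / action indices)
    u            : (n_states, n_actions) flow utilities
    q            : (n_actions, n_states, n_states) transitions q(x'|x, d)
    v_bar        : (n_states,) integrated value function from the inner loop
    """
    n_states, n_actions = u.shape
    # Choice-specific values v(d, x) and logit choice probabilities P(d | x).
    v = u + beta * np.stack([q[d] @ v_bar for d in range(n_actions)], axis=1)
    log_p_choice = v - logsumexp(v, axis=1, keepdims=True)

    ll = log_p_choice[x_obs, d_obs].sum()
    # Transition term: sum_t log q(x_{t+1} | x_t, d_t).
    ll += np.log(q[d_obs[:-1], x_obs[:-1], x_obs[1:]]).sum()
    # The log Pr(x_1 | theta) term would be added here if the initial state is modeled.
    return ll
```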

Interfaces

Dynamic Discrete Choice

  • fit(ddcfamily, data, algorithm): Fit the dynamic discrete choice model to the data using the given algorithm.

Nested Fixed Point

  • stage1_problem(nfxpfamily, data): Get the problem description for stage 1. This description is used by a solver to find the optimal transition parameters (a sketch of the two stages follows this list).
  • stage2_problem(nfxpfamily, data): Get the problem description for stage 2. This description is used by a solver to find the optimal utility parameters.
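
As a rough illustration of the two stages: when the state space is discrete and the transition matrix is unrestricted, the stage 1 problem has a closed-form solution (observed transition frequencies), while stage 2 maximizes the choice part of the log-likelihood, re-solving the inner fixed point at each trial value of the utility parameters. The sketch below covers only the stage 1 closed form and assumes, hypothetically, a single pooled series of observations.

```python
import numpy as np


def stage1_transition_mle(x_obs, d_obs, n_states, n_actions):
    """Closed-form stage 1 MLE of q(x'|x, d) for an unrestricted discrete
    transition matrix: observed transition frequencies in each (x, d) cell."""
    counts = np.zeros((n_actions, n_states, n_states))
    for x, d, x_next in zip(x_obs[:-1], d_obs[:-1], x_obs[1:]):
        counts[d, x, x_next] += 1
    totals = counts.sum(axis=2, keepdims=True)
    # Cells with no observations are left as NaN rather than dividing by zero.
    return np.divide(counts, totals, out=np.full_like(counts, np.nan), where=totals > 0)
```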

MDP Result

  • policy(ddcresult, state): Return the action prescribed by the fitted policy at the given state.
  • sample(ddcresult, state, n): Generate n samples from the MDP, using state as the starting state (a usage sketch follows this list).
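
One way these two methods could behave, sketched with plain Python functions over a toy fitted result (the choice probabilities and transition array below are made-up stand-ins, not the output of a real fit call):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "fitted result": choice probabilities P(d|x) and transitions q(x'|x, d).
choice_probs = np.array([[0.7, 0.3],
                         [0.2, 0.8]])            # (n_states, n_actions)
q = np.array([[[0.9, 0.1], [0.5, 0.5]],
              [[0.3, 0.7], [0.1, 0.9]]])         # (n_actions, n_states, n_states)
result = (choice_probs, q)


def policy(ddcresult, state):
    """Most likely action at `state` under the fitted choice probabilities."""
    probs, _ = ddcresult
    return int(np.argmax(probs[state]))


def sample(ddcresult, state, n):
    """Simulate n (state, action) pairs from the MDP, starting at `state`."""
    probs, trans = ddcresult
    path = []
    for _ in range(n):
        action = rng.choice(probs.shape[1], p=probs[state])
        path.append((state, action))
        state = rng.choice(trans.shape[2], p=trans[action, state])
    return path


policy(result, 0)     # -> 0
sample(result, 0, 5)  # -> list of five (state, action) pairs
```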

Discrete Transition

  • condprob(transition, state_ind, action_ind): Return the conditional distribution over next states for a transition.
  • prob(transition, state_ind, action_ind, new_state_ind): Return the probability of a transition occurring.

condprob should return a vector of probabilities over the possible next states, while prob should return the probability of the single specified transition.
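
A minimal sketch of a concrete type satisfying this interface, using a NumPy array indexed as `probs[action, state, new_state]` (the layout and names are assumptions for the example; the document does not fix a language or a storage format):

```python
import numpy as np

# Toy transition: probs[action, state] is the distribution over next states.
probs = np.array([[[0.9, 0.1], [0.6, 0.4]],
                  [[0.2, 0.8], [0.0, 1.0]]])


def condprob(transition, state_ind, action_ind):
    """Conditional distribution over next states given (state, action)."""
    return transition[action_ind, state_ind]


def prob(transition, state_ind, action_ind, new_state_ind):
    """Probability of moving from state_ind to new_state_ind under action_ind."""
    return transition[action_ind, state_ind, new_state_ind]


condprob(probs, 0, 1)  # -> array([0.2, 0.8])
prob(probs, 0, 1, 1)   # -> 0.8
```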

Discrete VariableSpaces

  • levels(vs): Return an iterable of all possible levels.

Estimators

  • Nested Fixed Point (NFXP)
  • Nested Pseudo Likelihood
  • Mathematical Program with Equilibrium Constraints (MPEC)
  • Expectation Maximization

Confidence Interval

  • Parametric Bootstrap
  • Resampling Bootstrap
  • BHHH

Statistics

  • cross-validation

Bibliography