This notebook is meant to be for a "quick" overview of what is presented in the paper. If you find this helpful, then we highly recommend reading the original paper as it is more thorough in its presentation of the material.
Consider repeated two-player games with perfect monitoring and discounting. We provide an algorithm that computes the set $V^*$ of payoff pairs of all pure-strategy subgame perfect equilibria with public randomization. The algorithm provides significant efficiency gains over existing implementations of the algorithm from Abreu, Pearce, Stacchetti (1990). These efficiency gains arise from a better understanding of the manner in which extreme points of the equilibrium payoff set are generated. An important theoretical implication of our algorithm is that the set of extreme points $E$ of $V^*$ is finite. Indeed, $|E| \leq 3 |A|$, where $A$ is the set of action profiles of the stage game.
This paper considers a two-player repeated game with finite simultaneous actions and discounting. The stage game will be denoted by $G = \{N, \{ A_i \}_{i \in N}, \{g_i\}_{i \in N} \}$ in which players $N=\{1, 2\}$ share a discount factor $\delta$.
We will begin with some definitions that will make the presentation more clean.
We will say that a point $v \in \mathcal{R}$ is generated by the set $X$ if there is an action pair $a \in A$ in the current period and a pair of continuation values $w \in X$ such that:
Then in conjunction with previous APS papers, we will say that a set $X$ is self-generating if each extreme point of $X$ can be generated by $X$.
In their original papers, Abreu, Pearce, and Stacchetti proposed a repeated application of a "$B$" operator to a set of continuation values and showed that the operator satisfied certain conditions that imply it will converge to the set of values which correspond to the subgame-perfect equilibria. The APS papers prove (among other things) that $B^n(W) \rightarrow V^*$. The $B$ operator is defined by
$$B(W) = \text{co}\{v | \exists w \in W, a \in A \text{ s.t } v=(1-\delta)g(a) + \delta w \text{ and } \delta(w - P(W)) \geq (1-\delta)h(a) \}$$Given a current set of continuation values, $W$, this operation could be split into several simple steps
One potential issue of this approach in a numerical setting is that the number of vertices in the set $B_a(W)$ isn't necessarily bounded -- Luckily, Abreu Sannikov will show (using the following theorem) that we can bound this and give a specific bound.
Theorem 1: Any action profile $a$ such that $g(a) \geq \underline{v} + (1-\delta)/\delta h(a)$ generates at most one extreme point of $V^*$, $v = g(a)$. Any action profile for which
- (1) $g_1(a) < \underline{v}_1 + \frac{1-\delta}{\delta} h_1(a)$ or
- (2) $g_2(a) < \underline{v}_2 + \frac{1-\delta}{\delta} h_2(a)$
generates at most four extreme points of $V^*$, using continuation payoff vectors $w$ that are extreme points of $Q(a, V^*, \underline{v})$ such that
- $\delta (w_1 - \underline{v}_1) = (1-\delta) h_1(a)$ or
- $\delta (w_2 - \underline{v}_2) = (1-\delta) h_2(a)$
This bound on the number of extreme points is a particularly important theoretical result and relies on the fact that (insert more here).
In [ ]: