Repeated games

Definition of a repeated game

Given a two player game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, referred to as a stage game, a $T$-stage repeated game is a game in which players play that stage game for $T > 0$ periods. Players make decisions based on the full history of play over all the periods.

For example consider the game:

$$ A = \begin{pmatrix} 0 & 6 & 1\\ 1 & 7 & 5 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & 3 & 1\\ 1 & 0 & 1 \end{pmatrix} $$

by identifying the best responses:

$$ A = \begin{pmatrix} 0 & 6 & 1\\ \underline{1} & \underline{7} & \underline{5} \end{pmatrix} \qquad B = \begin{pmatrix} 0 & \underline{3} & 1\\ \underline{1} & 0 & \underline{1} \end{pmatrix} $$

it is immediate to find two Nash equilibria:

$$((0,1), (1,0,0))\qquad((0,1), (0,0,1))$$

If we were to repeat this game twice ($T=2$) we obtain a new game. However to be able to think of this we need to define what a strategy in a repeated game is.

Definition of a strategy in a repeated game

Given a two player stage game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, repeated to give a $T$-stage repeated game. A strategy is a mapping from the entire history of play to an action of the stage game:

$$ \bigcup_{t=0}^{T-1}H(t)\to a $$

where:

$H(t)$ is the history of play of both players up until stage $t$ ($H(0)=(\emptyset, \emptyset)$)
$a$ is an action (for either player) of the stage game

To help avoid confusion, whilst we have referred to pure strategies as choices made in stage games, here we will call those actions.

The actions for our example:

for the row player: $\{r_1, r_2\}$ (corresponding to the rows)
for the column player: $\{c_1, c_2, c_3\}$ (corresponding to the columns)

A strategy for the row/column player thus needs to map an element of the following set to an element of $\{r_1, r_2\}$/$\{c_1, c_2, c_3\}$:

$$\bigcup_{t=0}^{1}H(t)=\{(\emptyset, \emptyset), (r_1, c_1), (r_1, c_2), (r_1, c_3), (r_2, c_1), (r_2, c_2), (r_3, c_3)\}$$

In other words, in our example, a strategy answers both of the following questions:

What should the player do in the first period?
What should the player do in the second period given knowledge of what both players did in the first period?

The following theorem allows us to find a Nash equilibrium:

Theorem of sequence of stage Nash equilibria

Video

For any repeated game, any sequence of stage Nash profiles gives a Nash equilibrium.

Proof

Consider the following strategy:

The row/column player should play action $a_{r/c}$ regardless of the play of any previous strategy profiles.

where $(a_{r}, a_{c})$ is a given stage Nash equilibrium.

Using backwards induction, this is a Nash equilibrium for the last stage game. Thus, at the last stage, no player has a reason to deviate. Similarly at the $T-1$th stage. The proof follows.

Thus, for our example we have the four Nash equilibria:

$(r_2r_2, c_1c_1)$ with utility: (2, 2).
$(r_2r_2, c_1c_3)$ with utility: (6, 2).
$(r_2r_2, c_3c_1)$ with utility: (6, 2).
$(r_2r_2, c_3c_3)$ with utility: (10, 2).

Note however that it is not the only equilibria for our repeated game.

Reputation

Video

In a repeated game it is possible for players to encode reputation and trust in their strategies.

Consider the following two strategies:

For the row player:

$$(\emptyset, \emptyset) \to r_1$$ $$(r_1, c_1) \to r_2$$ $$(r_1, c_2) \to r_2$$ $$(r_1, c_3) \to r_2$$
For the column player:

$$(\emptyset, \emptyset) \to c_2$$ $$(r_1, c_2) \to c_3$$ $$(r_2, c_2) \to c_1$$

Note that here we omit some of the histories which are not possible based on the first play by each player.

This strategy corresponds to the following scenario:

Play $(r_1,c_2)$ in first stage and $(r_2,c_3)$ in second stage unless the row player does not cooperate in which case play $(r_2, c_1)$.

If both players play these strategies their utilities are: $(11, 4)$ which is better for both players then the utilities at any sequence of pure stage Nash equilibria. But is this a Nash equilibrium? To find out we investigate if either player has an incentive to deviate.

If the row player deviates, they would only be rational to do so in the first stage, if they did they would gain 1 in that stage but lose 4 in the second stage. Thus they have no incentive to deviate.
If the column player deviates, they would only do so in the first stage and gain no utility.

Thus this strategy pair is a Nash equilibrium and evidences how a reputation can be built and cooperation can emerge from complex dynamics.