Prisoner's Dilemma - exercises

Give the general definition of a Prisoner's Dilemma.

Bookwork: https://vknight.org/gt/chapters/09/#Prisoners-Dilemma

Justify whether each of the following games is a Prisoner's Dilemma:
1. $ A = \begin{pmatrix} 3 & 0\\ 5 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 3 & 5\\ 0 & 1 \end{pmatrix} $
  
  This is a Prisoner's Dilemma: $(R, S, T, P) = (3, 0, 5, 1)$: $5>3>1>0$ and $2\times 3 > 5 + 0$.
2. $ A = \begin{pmatrix} 1 & -1\\ 2 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 2\\ -1 & 0 \end{pmatrix} $
  
  This is a Prisoner's Dilemma: $(R, S, T, P) = (1, -1, 2, 0)$: $2>1>0>-1$ and $2\times 1 > 2 - 1$.
3. $ A = \begin{pmatrix} 1 & -1\\ 2 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 3 & 5\\ 0 & 1 \end{pmatrix} $
  
  This is not a Prisoner's Dilemma since $A \ne B ^ T$.
4. $ A = \begin{pmatrix} 6 & 0\\ 12 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 6 & 12\\ 0 & 0 \end{pmatrix} $
  
  This is not a Prisoner's Dilemma since $A \ne B ^ T$ (the bottom right entries differ: $1 \ne 0$).
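
The checks above can be automated. The following is a minimal sketch, assuming the definition used in the linked chapter is $A = \begin{pmatrix} R & S\\ T & P \end{pmatrix}$, $B = A^T$ with $T > R > P > S$ and $2R > T + S$ (the conditions checked in the answers above). The function name `is_prisoners_dilemma` is illustrative and not taken from the chapter.

```python
import numpy as np


def is_prisoners_dilemma(A, B):
    """
    Check the (assumed) Prisoner's Dilemma conditions:
    B is the transpose of A, T > R > P > S and 2R > T + S,
    where A = [[R, S], [T, P]].
    """
    A, B = np.asarray(A), np.asarray(B)
    if A.shape != (2, 2) or B.shape != (2, 2):
        return False
    if not np.array_equal(A, B.T):
        return False
    (R, S), (T, P) = A
    return bool(T > R > P > S and 2 * R > T + S)


# The four games of this question, in order.
games = [
    ([[3, 0], [5, 1]], [[3, 5], [0, 1]]),
    ([[1, -1], [2, 0]], [[1, 2], [-1, 0]]),
    ([[1, -1], [2, 0]], [[3, 5], [0, 1]]),
    ([[6, 0], [12, 1]], [[6, 12], [0, 0]]),
]
results = [is_prisoners_dilemma(A, B) for A, B in games]
print(results)
```

The symmetry test is done first so that $(R, S, T, P)$ can be read from $A$ alone.
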

Obtain the Markov chain representation for a match between reactive strategies with the following vectors:

Bookwork: this is a substitution exercise using https://vknight.org/gt/chapters/09/#Markov-chain-representation-of-a-Match-between-two-reactive-strategies
1. $p=(1/2, 1/2)\qquad q=(1/2, 1/2)$
  
  $$M = \begin{pmatrix} 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4 \end{pmatrix}$$
2. $p=(1/4, 1/2)\qquad q=(1/2, 1/4)$
  
  $$M = \begin{pmatrix} 1/8&1/8&3/8&3/8\\ 1/4&1/4&1/4&1/4\\ 1/16&3/16&3/16&9/16\\ 1/8&3/8&1/8&3/8 \end{pmatrix}$$
3. $p=(1/3, 1/3)\qquad q=(2/3, 1/4)$
  
  $$M = \begin{pmatrix} 2/9 & 1/9 & 4/9 & 2/9\\ 2/9 & 1/9 & 4/9 & 2/9\\ 1/12 & 1/4 & 1/6 & 1/2\\ 1/12 & 1/4 & 1/6 & 1/2 \end{pmatrix}$$
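
For reference, the three matrices above are instances of one general form. Writing $p=(p_1, p_2)$ and $q=(q_1, q_2)$, and assuming the states are ordered $(CC, CD, DC, DD)$ as in the `make_matrix` code further down (this notation is an assumption read off that code, not quoted from the chapter), the matrix is:

$$M = \begin{pmatrix} p_1q_1 & p_1(1-q_1) & (1-p_1)q_1 & (1-p_1)(1-q_1)\\ p_2q_1 & p_2(1-q_1) & (1-p_2)q_1 & (1-p_2)(1-q_1)\\ p_1q_2 & p_1(1-q_2) & (1-p_1)q_2 & (1-p_1)(1-q_2)\\ p_2q_2 & p_2(1-q_2) & (1-p_2)q_2 & (1-p_2)(1-q_2) \end{pmatrix}$$

For example, with $p=(1/4, 1/2)$ and $q=(1/2, 1/4)$ the first entry of the third row is $p_1q_2 = 1/4 \times 1/4 = 1/16$.
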

Obtain the utilities for both players for the vectors of question 3 (the reactive strategies of the previous question).

Bookwork: this is a substitution exercise using https://vknight.org/gt/chapters/09/#Theorem:-steady-state-probabilities-for-match-between-reactive-players
1. $p=(1/2, 1/2)\qquad q=(1/2, 1/2)$ gives utilities: $(9/4, 9/4)$
2. $p=(1/4, 1/2)\qquad q=(1/2, 1/4)$ gives utilities: $(536/289, 621/289)$
3. $p=(1/3, 1/3)\qquad q=(2/3, 1/4)$ gives utilities: $(113/54, 49/27)$

Here is some Python code that also carries out these calculations:



In [1]:

    
import numpy as np
import itertools

def make_matrix(p, q):
    """
    Code to obtain Markov chain representation of match between two reactive players.
    """
    M = [[ele[0] * ele[1] for ele in itertools.product([player, 1 - player], 
                                                       [opponent, 1 - opponent])]
         for opponent in q for player in p]
    return np.array(M)

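def theoretic_steady_state(p, q):
    """
    Steady state probabilities of the states (CC, CD, DC, DD) for a match
    between two reactive players (requires r_1 * r_2 != 1).
    """
    r_1 = p[0] - p[1]
    r_2 = q[0] - q[1]
    # s_1 and s_2 are the long run cooperation probabilities of each player
    s_1 = (q[1] * r_1 + p[1]) / (1 - r_1 * r_2)
    s_2 = (p[1] * r_2 + q[1]) / (1 - r_1 * r_2)
    return np.array([s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_2)])

def theoretic_utility(p, q, rstp=np.array([3, 0, 5, 1])):
    """
    Utility of the player using p against q; rstp holds the payoffs in the
    order (R, S, T, P), matching the state order (CC, CD, DC, DD).
    """
    pi = theoretic_steady_state(p, q)
    return np.dot(pi, rstp)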



In [2]:

    
import sympy as sym
for p, q in [([sym.S(1) / 2, sym.S(1) / 2], [sym.S(1) / 2, sym.S(1) / 2]),
             ([sym.S(1) / 4, sym.S(1) / 2], [sym.S(1) / 2, sym.S(1) / 4]),
             ([sym.S(1) / 3, sym.S(1) / 3], [sym.S(2) / 3, sym.S(1) / 4])]:
    print("=====")
    print(p, q)
    print("gives:")
    print(make_matrix(p, q))
    print("With utility:", theoretic_utility(p, q), theoretic_utility(q, p))

=====
[1/2, 1/2] [1/2, 1/2]
gives:
[[1/4 1/4 1/4 1/4]
 [1/4 1/4 1/4 1/4]
 [1/4 1/4 1/4 1/4]
 [1/4 1/4 1/4 1/4]]
With utility: 9/4 9/4
=====
[1/4, 1/2] [1/2, 1/4]
gives:
[[1/8 1/8 3/8 3/8]
 [1/4 1/4 1/4 1/4]
 [1/16 3/16 3/16 9/16]
 [1/8 3/8 1/8 3/8]]
With utility: 536/289 621/289
=====
[1/3, 1/3] [2/3, 1/4]
gives:
[[2/9 1/9 4/9 2/9]
 [2/9 1/9 4/9 2/9]
 [1/12 1/4 1/6 1/2]
 [1/12 1/4 1/6 1/2]]
With utility: 113/54 49/27
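
The steady state formula can also be cross-checked numerically. The sketch below re-implements `make_matrix` and `theoretic_steady_state` with floats so that it is self-contained, and compares the closed form against a least squares solution of $\pi M = \pi$, $\sum_i \pi_i = 1$. The helper `numeric_steady_state` is an illustrative name that is not part of the code above, and the approach assumes the chain has a unique steady state (true here since every entry of each $M$ is positive).

```python
import numpy as np


def make_matrix(p, q):
    """Transition matrix over the states (CC, CD, DC, DD), as floats."""
    return np.array(
        [
            [a * b for a in (player, 1 - player) for b in (opponent, 1 - opponent)]
            for opponent in q
            for player in p
        ],
        dtype=float,
    )


def theoretic_steady_state(p, q):
    """Closed form steady state, copied from the formula used above."""
    r_1, r_2 = p[0] - p[1], q[0] - q[1]
    s_1 = (q[1] * r_1 + p[1]) / (1 - r_1 * r_2)
    s_2 = (p[1] * r_2 + q[1]) / (1 - r_1 * r_2)
    return np.array([s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_2)])


def numeric_steady_state(M):
    """Solve pi M = pi together with sum(pi) = 1 by least squares."""
    n = M.shape[0]
    lhs = np.vstack([M.T - np.eye(n), np.ones(n)])
    rhs = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi


pairs = [
    ((1 / 2, 1 / 2), (1 / 2, 1 / 2)),
    ((1 / 4, 1 / 2), (1 / 2, 1 / 4)),
    ((1 / 3, 1 / 3), (2 / 3, 1 / 4)),
]
for p, q in pairs:
    numeric = numeric_steady_state(make_matrix(p, q))
    print(p, q, np.allclose(numeric, theoretic_steady_state(p, q)))
```

Stacking the normalisation row onto $M^T - I$ avoids having to pick and rescale an eigenvector.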

5. Assuming $p=(x, 1/2)$, find the optimal $x$ against the following players:

Part of this question is bookwork: substituting into https://vknight.org/gt/chapters/09/#Theorem:-steady-state-probabilities-for-match-between-reactive-players gives a formula for the utility of $p$ as a function of $x$.

$q=(1, 0)$

$$u(x)=\frac{(-10x + 4(x - 1)^2 + 13)}{(2x - 3)^2}$$
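
As a sketch of the substitution, assuming the steady state formula as implemented in `theoretic_steady_state` above and the payoffs $(R, S, T, P) = (3, 0, 5, 1)$ (the default `rstp` values), we have $r_1 = x - 1/2$ and $r_2 = 1$, so that:

$$s_1 = \frac{q_2 r_1 + p_2}{1 - r_1 r_2} = \frac{1/2}{3/2 - x} = \frac{1}{3 - 2x} \qquad s_2 = \frac{p_2 r_2 + q_2}{1 - r_1 r_2} = \frac{1/2}{3/2 - x} = \frac{1}{3 - 2x}$$

Writing $s = s_1 = s_2$:

$$u(x) = 3s^2 + 5(1 - s)s + (1 - s)^2 = 1 + 3s - s^2 = \frac{4x^2 - 18x + 17}{(2x - 3)^2}$$

Expanding the numerator $-10x + 4(x - 1)^2 + 13$ gives the same $4x^2 - 18x + 17$.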

The derivative of this function is given by:

$$ \frac{2(6x - 7)}{(2x - 3)^3} $$

The derivative is zero only at $x=7/6$, which is $>1$, and the pole of $u$ at $x=3/2$ also lies outside $[0, 1]$. Thus the utility is monotonic over the interval $[0, 1]$. We have (by substitution):

$$u(0)=17/9\qquad u(1)=3$$

Since $u(0) < u(1)$, $u(x)$ is an increasing function on $[0, 1]$, so the optimal value of $x$ is $1$.

Against a player that is unforgiving (reacts to defection with defection), given that our player plays randomly after a defection, it is best to always respond to cooperation with cooperation.

$q=(1/2, 1/2)$

$$u(x)=-3x/4+21/8$$
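
The same substitution can be sketched for this opponent, again assuming the formula in `theoretic_steady_state` and the payoffs $(R, S, T, P) = (3, 0, 5, 1)$. Here $r_1 = x - 1/2$ and $r_2 = 0$, so:

$$s_1 = q_2 r_1 + p_2 = \frac{1}{2}\left(x - \frac{1}{2}\right) + \frac{1}{2} = \frac{x}{2} + \frac{1}{4} \qquad s_2 = q_2 = \frac{1}{2}$$

$$u(x) = \frac{3}{2}s_1 + \frac{5}{2}(1 - s_1) + \frac{1}{2}(1 - s_1) = 3 - \frac{3}{2}s_1 = \frac{21}{8} - \frac{3x}{4}$$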

This is a decreasing function so the optimal value of $x$ is $0$.

Against a random player (who takes no notice of what we do) it is better to defect.

Below is some code to verify these calculations.



In [3]:

    
x = sym.Symbol("x")
for q in [(sym.S(1), sym.S(0)), (sym.S(1) / 2, sym.S(1) / 2)]:
    utility = theoretic_utility((x, sym.S(1) / 2), q)
    derivative = sym.diff(utility, x)
    print(utility.simplify(), utility.subs({x: 0}), utility.subs({x: 1}),
          derivative.simplify(), sym.solveset(derivative, x))

(-10*x + 4*(x - 1)**2 + 13)/(2*x - 3)**2 17/9 3 2*(6*x - 7)/(2*x - 3)**3 {7/6}
-3*x/4 + 21/8 21/8 15/8 -3/4 EmptySet()
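
A grid evaluation gives an independent numerical check of both optima. This is only a sketch: the function name `utility`, the dictionary `best` and the grid of 101 points are illustrative choices, and the formula is copied from `theoretic_steady_state` above with the default payoffs $(R, S, T, P) = (3, 0, 5, 1)$.

```python
import numpy as np


def utility(p, q, rstp=(3, 0, 5, 1)):
    """Utility of a reactive player p against q, from the steady state formula."""
    r_1, r_2 = p[0] - p[1], q[0] - q[1]
    s_1 = (q[1] * r_1 + p[1]) / (1 - r_1 * r_2)
    s_2 = (p[1] * r_2 + q[1]) / (1 - r_1 * r_2)
    pi = np.array([s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_2)])
    return float(pi @ np.array(rstp))


xs = np.linspace(0, 1, 101)  # grid resolution is an arbitrary choice
best = {}
for q in [(1, 0), (1 / 2, 1 / 2)]:
    values = np.array([utility((x, 1 / 2), q) for x in xs])
    # Store the grid point with the highest utility and that utility.
    best[q] = (float(xs[np.argmax(values)]), float(values.max()))
    print(q, best[q])
```

A grid search only locates the optimum up to the grid spacing, but both utilities are monotonic on $[0, 1]$, so the maximum falls on an endpoint that the grid contains exactly.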