Prisoner's dilemma

The prisoner's dilemma is a standard example of a game analyzed in game theory that shows why two completely "rational" individuals might not cooperate, even if it appears that it is in their best interests to do so.

Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge. They hope to get both sentenced to a year in prison on a lesser charge. Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to: betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The offer is:

  • If A and B each betray the other, each of them serves 2 years in prison
  • If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa)
  • If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge)

It is assumed that both understand the nature of the game, and that despite being members of the same gang, they have no loyalty to each other and will have no opportunity for retribution or reward outside the game.

Because defection always results in a better payoff than cooperation, regardless of the other player's choice, it is a dominant strategy. The dilemma then is that mutual cooperation yields a better outcome than mutual defection but it is not the rational outcome because from a self-interested perspective, the choice to cooperate, at the individual level, is irrational.

All purely rational self-interested prisoners would betray the other because betraying a partner offers a greater reward than cooperating with them, and so the only possible outcome for two purely rational prisoners is for them to betray each other. The interesting part of this result is that pursuing individual reward logically leads both of the prisoners to betray, when they would get a better reward if they both kept silent. In reality, humans display a systemic bias towards cooperative behavior in this and similar games, much more so than predicted by simple models of "rational" self-interested action.

The prisoner's dilemma game can be used as a model for many real world situations involving cooperative behaviour. In casual usage, the label "prisoner's dilemma" may be applied to situations not strictly matching the formal criteria of the classic or iterative games: for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation.

Exercise

The purpose is to show how the game evolves with different strategies. Lets kick some matches with a Prissoner's Dilemma matrix and 100 rounds per match.

Strategies:

  • Nice: Cooperates with 70% chance.
  • Crazy: Cooperates with 50% chance.
  • Alternate: Alternates cooperate / defeat.
  • AlternateCCD: Cooperates 2 of every 3 times.
  • Naive: Repeats the last play if the opponent cooperated.
  • NaiveProber: Defeats with 20% chance, else repeats the last play if the opponent cooperated.
  • TitForTat: Plays what the opponent played last time.
  • TitForTwoTats: Cooperates if the opponent cooperated the last 2 times.
  • SmoothTitForTat: Cooperates with 10% chance, othewise like tit for tat.
  • Selfish: Cooperates with 50% chance only if the opponent cooperated the last time.
  • Majority: Defects when the opponent defected more that 50% of the times.
  • Spiteful: Cooperates until opponent defeats. Then defeats always.

Pair the strategies as shoen in the code and look for the better results.


In [37]:
from pydilemma.game_play import *

play_with('Nice', 'TitForTat') # These 2 guys get along very well... 
#play_with('Nice', 'Naive') # Naive tries to get advantage of what works...
#play_with('Nice', 'NaiveProber') # And Naive Prober tries aggressively...
#play_with('NaiveProber', 'Majority') # But the NaiveProber can't compete against a Majority!
#play_with('SmoothTitForTat', 'TitForTwoTats') # On the other hand, Tit For Tat cousins destroy each other...
#play_with('Selfish', 'Crazy') # A Selfish guy gets a good chance against someone who doesn't know too much about the game...
#play_with('Alternate', 'Majority') # Some really simple strategies can get good results sometimes...


Alice scored 161 with strategy Nice
Bob scored 161 with strategy TitForTat
*** END ***

Exercise 2

We can simulate a population of strategies, put them to play, and see how the population evolve. These are the rules:

  1. There is a initial queue filled randomly with strategies passed as input.
  2. It pulls the first 2 players of the queue and performs a game with 50 rounds per game.
  3. If one player reaches 70 points, 2 players with the same strategy will be added to the queue. At that time, if the other one has more than 50 points, 1 player with the same strategy will be added to the queue. Both scores reset.
  4. Repeat until queue is empty or number of generations limit has reached.

Lets do some tests with 200 generations each, and try to get the best strategies.


In [39]:
from pydilemma.generational import *

Generational('PrisonerMatrix', ['TitForTat', 'Alternate', 'Majority', 'Naive', 'Nice'], 200).start()


Population size after 200 generations is 340
40.6% of the population are TitForTat
30.6% of the population are Majority
23.5% of the population are Naive
4.1% of the population are Nice
1.2% of the population are Alternate

References