Infinite-Horizon Repeated Prisoner's Dilemma

Setup

Definition:

Infinite-Horizon Repeated Prisoner’s Dilemma

Players: Two players, Player 1 and Player 2.

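Strategies: In every period each player chooses $C$ (cooperate) or $D$ (defect). Under Grim Trigger, a player starts with $C$ and keeps playing $C$ as long as no one has defected; after any defection by either player, the player chooses $D$ forever.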

Rules

The stage game is a Prisoner’s Dilemma in which both players simultaneously choose $C$ or $D$ each period; the analysis focuses on the profile in which both players follow Grim Trigger.
Cooperation is sustained when the threat of permanent punishment makes every deviation unprofitable.
The stage game is repeated infinitely many times.
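Future payoffs are discounted by a factor $\delta \in (0,1)$ per period; a player's total utility is the present discounted value of the payoff stream, $\sum_{t=0}^{\infty} \delta^t u_t$ , where $u_t$ is the period-$t$ payoff.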
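As a quick numerical illustration of discounting, the sketch below evaluates the present discounted value of a finite payoff stream and compares a long run of the cooperative payoff $4$ with the geometric-series value $\frac{4}{1-\delta}$. The function name `discounted_value`, the choice $\delta = 0.5$, and the 200-period truncation are assumptions made for illustration only.

```python
def discounted_value(payoffs, delta):
    """Present discounted value of a finite payoff stream u_0, u_1, ..."""
    return sum(delta**t * u for t, u in enumerate(payoffs))


delta = 0.5  # illustrative discount factor (assumption)

# 200 periods of the cooperative payoff 4 approximate the infinite stream.
approx = discounted_value([4] * 200, delta)

# Closed form of the geometric series 4 + 4*delta + 4*delta^2 + ...
exact = 4 / (1 - delta)

print(approx, exact)
```

The truncation error is $\frac{4\,\delta^{200}}{1-\delta}$, which is negligible here but grows as $\delta$ approaches $1$, so a longer horizon would be needed for very patient players.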

Payoff Matrix

Player 1 \ Player 2	C	D
C	4, 4	-2, 6
D	6, -2	0, 0
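The matrix can be checked mechanically for the one-shot dominance of defection. The sketch below assumes the usual convention that rows are Player 1's actions and each cell lists Player 1's payoff first; the names `PAYOFFS`, `row_payoff`, and `strictly_dominates` are hypothetical helpers, not part of the original notes.

```python
# Stage-game payoffs: (row action, column action) -> (row payoff, column payoff).
PAYOFFS = {
    ("C", "C"): (4, 4),
    ("C", "D"): (-2, 6),
    ("D", "C"): (6, -2),
    ("D", "D"): (0, 0),
}


def row_payoff(own, other):
    """Payoff to the row player when playing `own` against `other`."""
    return PAYOFFS[(own, other)][0]


def strictly_dominates(a, b):
    """True if action a beats action b against every opponent action."""
    return all(row_payoff(a, o) > row_payoff(b, o) for o in ("C", "D"))


print(strictly_dominates("D", "C"))  # True: 6 > 4 and 0 > -2
```

Because the game is symmetric, the same check applies to Player 2, so $(D,D)$ is the unique equilibrium of the one-shot game.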

Derivation (Best Response Analysis)

There are two relevant states under Grim Trigger:
- cooperative state: no deviation has occurred,
- punishment state: some deviation has occurred.
In the cooperative state, obeying Grim Trigger yields

$$V_C = 4 + \delta V_C \quad\Longrightarrow\quad V_C = \frac{4}{1-\delta}.$$

Deviating once yields

$$V_D = 6 + \delta \cdot 0 = 6,$$

because all future play switches to $(D,D)$ .

Cooperation is optimal in the cooperative state if and only if

$$\frac{4}{1-\delta} \geq 6 \iff 4 \geq 6(1-\delta) \iff \delta \geq \frac{1}{3}.$$
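The threshold can be confirmed with exact arithmetic. This is a minimal sketch that evaluates both sides of the cooperation condition for a few discount factors; the function names and the sample values of $\delta$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def cooperate_value(delta):
    """Value of following Grim Trigger in the cooperative state: 4 / (1 - delta)."""
    return 4 / (1 - delta)


def deviate_value(delta):
    """Value of a one-shot deviation: 6 today, then (D, D) forever."""
    return 6 + delta * 0


def cooperation_is_optimal(delta):
    return cooperate_value(delta) >= deviate_value(delta)


# Exact fractions avoid floating-point noise at the boundary delta = 1/3.
for delta in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(delta, cooperate_value(delta), cooperation_is_optimal(delta))
```

At $\delta = \tfrac{1}{3}$ the two values coincide at $6$, so the player is exactly indifferent and the weak inequality holds.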

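In the punishment state, sticking with $D$ gives $0$ in every period, for a total value of $0$ .
Deviating to $C$ for one period gives $-2$ immediately and does not restore cooperation, so its value is $-2 + \delta \cdot 0 = -2 < 0$ , which is worse than staying with $D$ for every $\delta$ .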

Derivation (Nash Equilibrium)

By the one-shot deviation principle, Grim Trigger is subgame perfect exactly when no profitable one-period deviation exists in either state.
The punishment state is always incentive compatible.
The cooperative state is incentive compatible if and only if $\delta \geq \tfrac{1}{3}$ .
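The one-shot deviation check can also be run as a small simulation. The sketch below plays out Grim Trigger for both players, lets Player 1 deviate once in the current period, and compares truncated discounted payoffs in each state. The helper names, the 400-period horizon, and the tolerance are assumptions; the truncation is only accurate when $\delta$ is not too close to $1$.

```python
PAYOFFS = {
    ("C", "C"): (4, 4),
    ("C", "D"): (-2, 6),
    ("D", "C"): (6, -2),
    ("D", "D"): (0, 0),
}


def grim_trigger(history):
    """Cooperate unless any player has defected in any earlier period."""
    return "D" if any("D" in profile for profile in history) else "C"


def player1_value(delta, history, first_action=None, horizon=400):
    """Truncated discounted payoff to Player 1 from the current period on.

    Both players follow Grim Trigger, except that Player 1 plays
    `first_action` in the current period if one is given.
    """
    history = list(history)
    total = 0.0
    for t in range(horizon):
        a1 = grim_trigger(history)
        a2 = grim_trigger(history)
        if t == 0 and first_action is not None:
            a1 = first_action  # the one-shot deviation
        total += delta**t * PAYOFFS[(a1, a2)][0]
        history.append((a1, a2))
    return total


def no_profitable_one_shot_deviation(delta, tol=1e-9):
    cooperative = []            # no deviation has occurred
    punishment = [("D", "C")]   # some deviation has occurred
    ok_coop = (player1_value(delta, cooperative)
               >= player1_value(delta, cooperative, "D") - tol)
    ok_punish = (player1_value(delta, punishment)
                 >= player1_value(delta, punishment, "C") - tol)
    return ok_coop and ok_punish


for delta in (0.2, 1 / 3, 0.5):
    print(delta, no_profitable_one_shot_deviation(delta))
```

Checking only Player 1 is enough here because the game and the strategy profile are symmetric.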

Nash Equilibrium

Result:

If $\delta \geq \tfrac{1}{3}$ , the strategy profile in which both players use Grim Trigger is a subgame perfect Nash equilibrium and sustains $(C,C)$ forever on the equilibrium path.

Social Optimum

The socially optimal path is perpetual cooperation, giving each player a discounted payoff of $\frac{4}{1-\delta}$ .
Permanent defection yields each player $0$ .
Grim Trigger can decentralise the social optimum when players are sufficiently patient.

Insights

Insight:

A severe and credible punishment can overturn the one-shot dominance of defection.

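The condition $\delta \geq \tfrac{1}{3}$ measures how much players value the future relative to the current gain from cheating: deviating gains $6 - 4 = 2$ today but forfeits the cooperative payoff of $4$ in every later period, a loss worth $\frac{4\delta}{1-\delta}$ today, and the gain does not exceed the loss exactly when $\delta \geq \tfrac{1}{3}$ .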
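The same trade-off can be written for generic payoffs. In the sketch below, `temptation`, `reward`, and `punishment` are labels introduced here (not used in the notes) for the payoffs $6$, $4$, and $0$. Under Grim Trigger, cooperation requires reward/(1 - δ) ≥ temptation + δ · punishment/(1 - δ), which rearranges to δ ≥ (temptation - reward)/(temptation - punishment).

```python
from fractions import Fraction


def critical_discount_factor(temptation, reward, punishment):
    """Smallest delta at which Grim Trigger sustains cooperation.

    Assumes integer payoffs with temptation > reward > punishment.
    """
    return Fraction(temptation - reward, temptation - punishment)


print(critical_discount_factor(6, 4, 0))  # 1/3
```

A larger temptation payoff raises the threshold, while a harsher punishment payoff lowers it.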
