Game theory
October 13, 2016 — January 5, 2025
I have nothing to say about foundational game theory itself, except to note that J. D. Williams’ book The Compleat Strategyst (Williams 1966) is online for free, so you should get it.
“How long until we approach Nash equilibrium” also includes a note on Aumann’s correlated equilibrium, which I would like to know more about.
1 Prisoner’s dilemma
When people talk about game theory they usually mean the class of mathematically formulated two-player “games,” which are typically not the fun type of games, being much shorter and more brutal than Carcassonne or whatever.
The most famous one is the Prisoner’s dilemma; you’ve probably run into this one. Alice and Bob, co-conspirators, have been arrested for a crime they did commit, and they are interviewed separately. The police offer each of them the same choice: “Inform on your buddy and we will let you off lightly.” Obviously, each wants to spend as little time in prison as possible; what should they each do?
There are four possible outcomes:
- Both defect: Alice and Bob both turn informant. They both go to prison for 10 years.
- Bob defects: Bob informs and Alice stays stumm. Alice goes to prison for 10 years; Bob walks free after 1 year.
- Alice defects: Bob stays stumm and Alice informs. Bob goes to prison for 10 years; Alice walks free after 1 year.
- Both cooperate: Alice and Bob both stay stumm. They each go to prison for 2 years on a lesser charge.
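Label each player’s two actions $C$ (cooperate, i.e. stay stumm) and $D$ (defect, i.e. inform). The payoffs, written as (Alice, Bob) with higher being better, are

| | Bob: C | Bob: D |
|---|---|---|
| Alice: C | (R,R) | (S,T) |
| Alice: D | (T,S) | (P,P) |

where the usual ordering is $T > R > P > S$: $T$ is the temptation to defect, $R$ the reward for mutual cooperation, $P$ the punishment for mutual defection, and $S$ the sucker’s payoff.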
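A player’s best response to a fixed opponent strategy is the action that maximises her payoff given what the other does.

- If Bob plays $C$, Alice’s payoffs are $R$ (from $C$) and $T$ (from $D$); since $T > R$, Alice’s best response is $D$.
- If Bob plays $D$, Alice’s payoffs are $S$ (from $C$) and $P$ (from $D$); since $P > S$, Alice’s best response is again $D$.

Because $D$ is Alice’s best response whatever Bob does, $D$ is a strictly dominant strategy for her, and by symmetry for Bob too.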
A Nash equilibrium is a profile of strategies where each player is playing a best response to the others.

- Here, since both players’ best response is always $D$, $(D,D)$ is the unique Nash equilibrium.
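The best-response argument can be checked by brute force. The following is a minimal sketch, assuming the illustrative values $T=5$, $R=3$, $P=1$, $S=0$ (any numbers with $T > R > P > S$ would do); the function names are made up for this example.

```python
from itertools import product

# Assumed illustrative payoffs satisfying T > R > P > S.
T, R, P, S = 5, 3, 1, 0
ACTIONS = ("C", "D")

# payoff[(alice_action, bob_action)] = (alice_payoff, bob_payoff)
payoff = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def best_responses(player, opponent_action):
    """Actions maximising `player`'s payoff (0 = Alice, 1 = Bob)
    against a fixed action of the other player."""
    def value(action):
        if player == 0:
            profile = (action, opponent_action)
        else:
            profile = (opponent_action, action)
        return payoff[profile][player]

    best = max(value(a) for a in ACTIONS)
    return {a for a in ACTIONS if value(a) == best}


def pure_nash_equilibria():
    """Pure profiles in which each action is a best response to the other."""
    return [
        (a, b)
        for a, b in product(ACTIONS, repeat=2)
        if a in best_responses(0, b) and b in best_responses(1, a)
    ]


print(best_responses(0, "C"), best_responses(0, "D"))  # {'D'} {'D'}
print(pure_nash_equilibria())  # [('D', 'D')]
```

This only enumerates pure strategies, which is enough here because $D$ is strictly dominant.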
An outcome is Pareto-efficient (PE) if there’s no alternative outcome that makes at least one player strictly better off without making anyone worse off.

Compare the four outcomes of PD, taking $R=3$ and $P=1$, with $T=5$ and $S=0$ as the conventional illustrative choices:

- $(C,C)$ yields $(3,3)$.
- $(D,D)$ yields $(1,1)$.
- $(C,D)$ yields $(0,5)$ (asymmetric).
- $(D,C)$ yields $(5,0)$ (asymmetric).

$(C,C)$ is Pareto-efficient: you can’t move to another outcome that raises one player’s payoff without dropping the other’s below 3. $(D,D)$ is not Pareto-efficient: both players could switch jointly to $(C,C)$ and each move from 1→3, so $(C,C)$ Pareto-dominates $(D,D)$.

- Dominance ⇒ Defection: Because $D$ is each player’s best response to anything, rational play leads to $(D,D)$.
- Pareto efficiency ⇒ Cooperation: In terms of group welfare, $(C,C)$ is strictly better for everybody than $(D,D)$.

This gap, with individual incentives pushing towards $(D,D)$ while both players would be better off at $(C,C)$, is what makes the game a dilemma.
This is a normal-form game, where the players choose their actions simultaneously and independently. Sequential and partially-observed games are more complicated, and we handle them in extensive form. Those pop up in e.g. causal agents.
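For a normal-form game this small, the Pareto comparison in the prisoner’s dilemma can also be checked by enumeration. This is a sketch under the same assumed payoffs as before ($T=5$, $R=3$, $P=1$, $S=0$); `dominates` and `pareto_efficient` are hypothetical helper names.

```python
# Assumed illustrative payoffs with T > R > P > S, written as (Alice, Bob).
T, R, P, S = 5, 3, 1, 0
outcomes = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def dominates(u, v):
    """True if payoff vector u Pareto-dominates v:
    nobody is worse off and somebody is strictly better off."""
    no_one_worse = all(x >= y for x, y in zip(u, v))
    someone_better = any(x > y for x, y in zip(u, v))
    return no_one_worse and someone_better


def pareto_efficient(outcomes):
    """Profiles whose payoffs are not Pareto-dominated by any other outcome."""
    return [
        profile
        for profile, u in outcomes.items()
        if not any(dominates(v, u) for v in outcomes.values())
    ]


print(pareto_efficient(outcomes))
# [('C', 'C'), ('C', 'D'), ('D', 'C')]
```

Note that the asymmetric outcomes are Pareto-efficient too; the only inefficient outcome is $(D,D)$, which is also the only Nash equilibrium.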
2 Iterated games
Iterated games are a class of games where the players play the same game multiple times, and they can use the results of previous rounds to inform their decisions in later rounds. These parallel many interesting dynamics in the real world. See Iterated and evolutionary game theory for a more detailed discussion of iterated games.
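As a rough illustration of how earlier rounds can inform later ones, here is a minimal sketch of an iterated prisoner’s dilemma. The payoff values ($T=5$, $R=3$, $P=1$, $S=0$) and the two strategies (always defect, and tit-for-tat, which copies the opponent’s previous move) are assumptions chosen for the example, not something this page specifies.

```python
# Assumed illustrative payoffs with T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def always_defect(my_history, their_history):
    """Ignore the past and defect every round."""
    return "D"


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"


def play(strategy_a, strategy_b, rounds):
    """Play the stage game repeatedly and return the total scores (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30)
print(play(tit_for_tat, always_defect, 10))    # (9, 14)
print(play(always_defect, always_defect, 10))  # (10, 10)
```

Against a defector, tit-for-tat loses only the first round; two tit-for-tat players sustain mutual cooperation throughout.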
3 Stochastic games
In standard game theory, mixed strategies involve probabilistic choices over pure strategies. Stochastic games add a second source of randomness: the game has a state, and that state evolves over time in response to the players’ actions.
Stochastic games combine game theory with Markov decision processes. Players make decisions in a sequence of states, with each action affecting both immediate payoffs and the probability distribution of future states. Unlike standard mixed strategies where randomisation occurs only over action choices, stochastic games include:
- Multiple states that change over time
- Transition probabilities between states
- State-dependent payoff structures
- Potentially infinite horizons
In this framework, optimal strategies must account not only for immediate rewards but also for how actions influence the game’s future trajectory. This makes stochastic games particularly suitable for modelling economic competition, resource management, and multi-agent reinforcement learning scenarios where the environment changes in response to players’ actions.
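To make the ingredients concrete, here is a toy two-state, two-player stochastic game evaluated under fixed stationary policies. Everything in it is invented for illustration: the states (`abundant`, `depleted`), the actions (`restrain`, `harvest`), the payoffs, the transition probabilities and the discount factor. It only evaluates given policies by iterating the discounted Bellman recursion; it does not solve for equilibrium strategies.

```python
GAMMA = 0.9  # assumed discount factor
STATES = ("abundant", "depleted")

# Assumed stage payoffs in the abundant state, as (player 1, player 2).
ABUNDANT_PAYOFF = {
    ("restrain", "restrain"): (3.0, 3.0),
    ("restrain", "harvest"): (0.0, 5.0),
    ("harvest", "restrain"): (5.0, 0.0),
    ("harvest", "harvest"): (1.0, 1.0),
}


def reward(state, a1, a2):
    """State-dependent payoffs: nothing to gain once the resource is depleted."""
    if state == "depleted":
        return (0.0, 0.0)
    return ABUNDANT_PAYOFF[(a1, a2)]


def p_abundant(state, a1, a2):
    """Probability that the next state is 'abundant' (made-up numbers)."""
    harvesters = (a1 == "harvest") + (a2 == "harvest")
    if state == "abundant":
        return {0: 1.0, 1: 0.5, 2: 0.0}[harvesters]
    return 0.5 if harvesters == 0 else 0.0


def evaluate(policy1, policy2, sweeps=1000):
    """Discounted value of each state for both players under fixed
    stationary policies (dicts mapping state -> action)."""
    v = {s: (0.0, 0.0) for s in STATES}
    for _ in range(sweeps):
        new_v = {}
        for s in STATES:
            a1, a2 = policy1[s], policy2[s]
            r1, r2 = reward(s, a1, a2)
            p = p_abundant(s, a1, a2)
            cont1 = p * v["abundant"][0] + (1 - p) * v["depleted"][0]
            cont2 = p * v["abundant"][1] + (1 - p) * v["depleted"][1]
            new_v[s] = (r1 + GAMMA * cont1, r2 + GAMMA * cont2)
        v = new_v
    return v


restrain = {s: "restrain" for s in STATES}
harvest = {s: "harvest" for s in STATES}

print(evaluate(restrain, restrain)["abundant"])  # approximately (30.0, 30.0)
print(evaluate(harvest, harvest)["abundant"])    # (1.0, 1.0)
print(evaluate(harvest, restrain)["abundant"])   # approximately (9.09, 0.0)
```

With these made-up numbers, harvesting against a restrained partner pays 5 immediately but is worth only about 9.1 in total, against 30 for mutual restraint, because it risks pushing the game into the depleted state. That is the sense in which actions matter through the future trajectory as well as the immediate reward.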
4 Shapley value
The Shapley value is a model of fair distribution of the proceeds of a coalition game. It ends up being interesting for feature selection, collective action problems and model explanation. TBC
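In the meantime, here is a minimal sketch of the computation: each player’s Shapley value is their marginal contribution to the coalition, averaged over every order in which the players could join. The example game (one left glove, two right gloves, a matched pair is worth 1) is a standard textbook toy chosen for illustration, and the function names are hypothetical. Exact enumeration is only feasible for a handful of players, since the number of orderings grows factorially.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all join orders.

    `value` maps a frozenset of players to the worth of that coalition.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining this coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def glove_game(coalition):
    """Worth 1 if the coalition holds the left glove and some right glove."""
    has_left = "L" in coalition
    has_right = "R1" in coalition or "R2" in coalition
    return 1.0 if has_left and has_right else 0.0


values = shapley_values(["L", "R1", "R2"], glove_game)
print(values)  # L gets 2/3, R1 and R2 get 1/6 each
```

The scarce left glove captures most of the surplus, and the values sum to the worth of the grand coalition.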
5 Bargaining and commitment
See commitment for a discussion of bargaining and commitment in the context of game theory.