Game theory

October 13, 2016 — January 5, 2025

bounded compute
cooperation
economics
game theory
incentive mechanisms
mind

I have nothing to say about foundational game theory itself, except to note that JD Williams’ book, The Compleat Strategyst () is online for free, so you should get it.

How long until we approach Nash equilibrium, also includes a note on Aumann’s correlated equilibrium which I would like to know about.

Figure 1

1 Prisoner’s dilemma

When people talk about game theory they usually mean the class of mathematically formulated two-player “games,” which are typically not the fun type of games, being much shorter and more brutal than Carcassone or whatever.

The most famous one is the Prisoner’s dilemma; you’ve probably run into this one. Alice and Bob, co-conspirators, have been arrested by the cops for a crime they did commit, and they are interviewed separately. They offer them each the same choice: “Inform on your buddy and we will let you off lightly.” Obviously, they want to have as little time in prison; what should they each do?

There are four possible outcomes:

  1. Both defect — Alice and Bob both turn informant: They both go to prison for 10 years
  2. Bob defects — Bob informs and Alice stays stumm: Alice goes to prison for 10 years, Bob walks free in 1 year
  3. Alice defects — Bob stays stumm and Alice informs: Bob goes to prison for 10 years, Alice walks free in 1 year
  4. Both cooperate — Alice and Bob both stay stumm: They each go to prison for 2 years on a lesser charge

Label each player’s two actions C (cooperate) and D (defect), and let payoffs be:

Bob: C Bob: D
Alice: C (R,R) (S,T)
Alice: D (T,S) (P,P)

where the usual ordering is T>R>P>S,2R>T+S. Commonly we set T=5,R=3,P=1,S=0.

A player’s best response to a fixed opponent strategy is the action that maximises her payoff given what the other does.

  • If Bob plays C, Alice’s payoffs are {play CR=3,play DT=5, so Alice’s best response is D.

  • If Bob plays D, Alice’s payoffs are {play CS=0,play DP=1, so Alice’s best response is again D.

Because D strictly dominates C (it’s best regardless of the other’s move), defecting is each player’s unique best response.

A Nash equilibrium is a profile of strategies where each player is playing a best response to the others.

  • Here, since both players’ best response is always D, (D,D) is the unique Nash equilibrium.

An outcome is Pareto‐efficient (PE) if there’s no alternative outcome that makes at least one player strictly better off without making anyone worse off.

  • Compare the four outcomes of PD:

    1. (C,C) yields (3,3).
    2. (D,D) yields (1,1).
    3. (D,C) yields (5,0) (asymmetric).
    4. (C,D) yields (0,5).
  • (C,C) is Pareto‐efficient: you can’t move to another outcome that raises one player’s payoff without dropping the other’s below 3.

  • (D,D) is not Pareto‐efficient: both players could switch jointly to (C,C) and each move from 1→3, so (C,C) Pareto‐dominates (D,D).

  • Dominance ⇒ Defection: Because D is each player’s best response to anything, rational play leads to (D,D).

  • Pareto efficiency ⇒ Cooperation: In terms of group welfare, (C,C) is strictly better for everybody than (D,D).

This gap—individual incentives pushing towards (D,D) despite a better mutual outcome at (C,C)—is the model of that stylised fact we often call a social dilemma.

This is a normal-form game, where the players choose their actions simultaneously and independently. Sequential and partially-observed games are more complicated, and we handle them in extensive form. Those pop up in e.g. causal agents.

2 Iterated games

Iterated games are a class of games where the players play the same game multiple times, and they can use the results of previous rounds to inform their decisions in later rounds. These parallel many interesting dynamics in the real world. See Iterated and evolutionary game theory for a more detailed discussion of iterated games.

3 Stochastic games

In standard game theory, mixed strategies involve probabilistic choices over pure strategies. Stochastic games extend this by incorporating state transitions that evolve over time.

Stochastic games combine game theory with Markov decision processes. Players make decisions in a sequence of states, with each action affecting both immediate payoffs and the probability distribution of future states. Unlike standard mixed strategies where randomisation occurs only over action choices, stochastic games include:

  1. Multiple states that change over time
  2. Transition probabilities between states
  3. State-dependent payoff structures
  4. Potentially infinite horizons

In this framework, optimal strategies must account not only for immediate rewards but also for how actions influence the game’s future trajectory. This makes stochastic games particularly suitable for modelling economic competition, resource management, and multi-agent reinforcement learning scenarios where the environment changes in response to players’ actions.

4 Shapley value

The Shapley value is a model of fair distribution of the proceeds of a coalition game. Ends up being interesting for feature selection and collective action problems and model explanation. TBC

5 Bargaining and commitment

See commitment for a discussion of bargaining and commitment in the context of game theory.

6 References

Alchian. 1950. “Uncertainty, Evolution, and Economic Theory.” The Journal of Political Economy.
Arthur. 1994. “Inductive Reasoning and Bounded Rationality: The El Farol Problem.” American Economic Review.
Aumann. 1974. “Subjectivity and Correlation in Randomized Strategies.” Journal of Mathematical Economics.
Axelrod. 1984. The evolution of cooperation.
Bednar, and Page. 2000. “Can Game(s) Theory Explain Culture? The Emergence of Cultural Behavior Within Multiple Games.”
Blume. 1993. The Statistical Mechanics of Strategic Interaction.” Games and Economic Behavior.
Brockhurst, Buckling, and Gardner. 2007. “Cooperation Peaks at Intermediate Disturbance.” Current Biology.
Cai, Daskalakis, and Weinberg. 2013. Understanding Incentives: Mechanism Design Becomes Algorithm Design.” arXiv:1305.4002 [Cs].
Castellano, Fortunato, and Loreto. 2009. Statistical Physics of Social Dynamics.” Reviews of Modern Physics.
Cesa-Bianchi, and Lugosi. 2006. Prediction, Learning, and Games.
Chaitin. 1977. “Algorithmic Information Theory.” IBM Journal of Research and Development.
Crawford, and Sobel. 1982. Strategic Information Transmission.” Econometrica: Journal of the Econometric Society.
Daskalakis, Deckelbaum, and Tzamos. 2012a. Optimal Pricing Is Hard.” In Internet and Network Economics.
———. 2012b. The Complexity of Optimal Mechanism Design.” arXiv:1211.1703 [Cs].
———. 2013. Mechanism Design via Optimal Transport.” In.
Durlauf. 1996. “Statistical Mechanics Approaches to Socioeconomic Behavior.”
Fosco, and Mengel. 2010. Cooperation Through Imitation and Exclusion in Networks.” Journal of Economic Dynamics and Control.
Foster, and Young. 2006. “Regret Testing: Learning to Play Nash Equilibrium Without Knowing You Have an Opponent.” Theoretical Economics.
Fox, MacDermott, Hammond, et al. 2023. On Imperfect Recall in Multi-Agent Influence Diagrams.” Electronic Proceedings in Theoretical Computer Science.
Galla, and Farmer. 2011. Complex Dynamics in Learning Complicated Games.” Complex Dynamics in Learning Complicated Games.
Gammerman. 2004. Algorithmic Learning in a Random World.
Gammerman, and Vovk. 2007. Hedging Predictions in Machine Learning.” The Computer Journal.
Greenblatt, Shlegeris, Sachan, et al. 2024. AI Control: Improving Safety Despite Intentional Subversion.”
Hammond, Fox, Everitt, et al. 2023. Reasoning about Causality in Games.” Artificial Intelligence.
Insua, Rios, and Banks. 2009. Adversarial Risk Analysis.” Journal of the American Statistical Association.
Jackson, Matthew O. 2008. Social and Economic Networks.
Jackson, Matthew O. 2011. A Brief Introduction to the Basics of Game Theory.” SSRN Electronic Journal.
Latek, Axtell, and Kaminski. 2009. “Bounded Rationality via Recursion.” In.
Lazaric, and Raybaut. 2004. “Knowledge Creation Facing Hierarchy: The Dynamics of Groups Inside the Firm.” Journal of Artificial Societies and Social Simulation.
Le, and Boyd. 2007. “Evolutionary Dynamics of the Continuous Iterated Prisoner’s Dilemma.” Journal of Theoretical Biology.
Linial. 1994. Game-Theoretic Aspects of Computing.” In Handbook of Game Theory with Economic Applications.
McElreath, and Boyd. 2007. Mathematical Models of Social Evolution: A Guide for the Perplexed.
Mesquita. 2010. The Predictioneer’s Game: Using the Logic of Brazen Self-Interest to See and Shape the Future.
Moral Sentiments and Material Interests: The Foundations of Cooperation in Economic Life. 2006.
Nowak, and Krakauer. 1999. “The Evolution of Language.” Proceedings of the National Academy of Sciences of the United States of America.
Nowak, Plotkin, and Krakauer. 1999. “The Evolutionary Language Game.” Journal of Theoretical Biology.
Ohsawa. 2021. Unbiased Self-Play.” arXiv:2106.03007 [Cs, Econ, Stat].
Ostrom. 1990. Governing the Commons: The Evolution of Institutions for Collective Action (Political Economy of Institutions and Decisions).
Pluchino, Rapisarda, and Garofalo. 2010. The Peter Principle Revisited: A Computational Study.” Physica A: Statistical Mechanics and Its Applications.
Rennard. 2006. Handbook of Research on Nature-Inspired Computing for Economics and Management.
Richards. 2001. Coordination and Shared Mental Models.” American Journal of Political Science.
Roca, Cuesta, and Sánchez. 2006. Time Scales in Evolutionary Dynamics.” Physical Review Letters.
Roughgarden, Tim. 2018. Complexity Theory, Game Theory, and Economics.” arXiv:1801.00734 [Cs, Econ].
Roughgarden, Joan, Oishi, and Akçay. 2006. “Reproductive Social Behavior: Cooperative Games to Replace Sexual Selection.” Science.
Rubinstein. 2000. Economics and Language.
Sadrieh. 1998. The Alternating Double Auction Market: A Game Theoretic and Experimental Investigation (Lecture Notes in Economics and Mathematical Systems).
Sanders, Galla, and Shapiro. 2011. Effects of Noise on Convergent Game Learning Dynamics.” arXiv:1109.4853.
Sato, Akiyama, and Farmer. 2002. Chaos in Learning a Simple Two-Person Game.” Proceedings of the National Academy of Sciences.
Sato, and Crutchfield. 2003. “Coupled Replicator Equations for the Dynamics of Learning in Multiagent Systems.” Physical Review E.
Schotter. 2008. The Economic Theory of Social Institutions.
Sethi, and Somanathan. 1996. The Evolution of Social Norms in Common Property Resource Use.” The American Economic Review.
Shafer, and Vovk. 2001. “Introduction: Probability and Finance as a Game.” In Probability and Finance: It’s Only a Game!
———. 2008. A Tutorial on Conformal Prediction.” Journal of Machine Learning Research.
Slobodkin, and Rapoport. 1974. “An Optimal Strategy of Evolution.” The Quarterly Review of Biology.
Spence. 1973. Job Market Signaling.” The Quarterly Journal of Economics.
———. 2002. Signaling in Retrospect and the Informational Structure of Markets.” American Economic Review.
Tooby, Cosmides, and Price. 2006. Cognitive Adaptations Forn-Person Exchange: The Evolutionary Roots of Organizational Behavior.” Managerial and Decision Economics.
Vincent. 2006. “Carcinogenesis As an Evolutionary Game.” Advances in Complex Systems.
Vovk, Nouretdinov, and Gammerman. 2009. On-Line Predictive Linear Regression.” The Annals of Statistics.
Williams. 1966. The Compleat Strategyst : Being a Primer on the Theory of Games of Strategy.
Wolpert, Harré, Olbrich, et al. 2010. “Hysteresis Effects of Changing Parameters of Noncooperative Games.” SSRN eLibrary.
Wu, Altrock, Wang, et al. 2010. “Universality of Weak Selection.”
Yanagita, and Onozaki. 2008. Dynamics of a Market with Heterogeneous Learning Agents.” Journal of Economic Interaction and Coordination.
Yang, Lin, Wu, et al. 2011. Topological Conditions of Scale-Free Networks for Cooperation to Evolve.” arXiv:1106.5386.
Young. 1996. The Economics of Convention.” The Journal of Economic Perspectives.
———. 1998a. Conventional Contracts.” The Review of Economic Studies.
———. 1998b. Individual Strategy and Social Structure : An Evolutionary Theory of Institutions.
———. 1998c. “Social Norms and Economic Welfare.” European Economic Review.
———. 2002. “The Diffusion of Innovations in Social Networks.”
———. 2005. The Spread of Innovations Through Social Learning.”
———. 2006. “Social Dynamics: Theory and Applications.” Handbook of Computational Economics.