Iterated conversation games
Is arsehat a stable strategy?
2022-07-08 — 2025-03-28
Wherein iterated game theory is applied to online conversation, and status and cheap talk are treated as tradable commodities among online tribes to explain how communication norms evolve and are maintained.
Assumed audience:
Anyone who wonders how to make conversations with strangers nicer. Or, nastier.
Under heavy construction.
I’ve been ruminating on how the principles of iterated game theory can shed light on our communication norms, especially online. I want to explore how these models can help us understand and perhaps improve the ways we converse with one another.
I think I flogged a certain idea to death in tokenism and table stakes. I am not satisfied with the results. We can see a bad repeating pattern in there, but what is the takeaway? How do we avoid it?
Let’s have a crack at seeing that piece, about weird social dynamics, in a bigger framework. Spoiler: I aim to build up a framework in which we can consider general evolutionary-game-theory-type models of how we treat each other.
Disclaimer: I doubt I am the first person to think of this model, but I want to reason it through without a literature review to see where it takes me. Specifically, I want an iterated game theory model of communication norms and movement design, and to explain things like Schelling-Goodharting along the way to my own satisfaction. Maybe also Invasive arguments and coalition dynamics while we are at it, who knows.
There are two pieces in this play: iterated game theory and cheap talk. Plugging these together, I think we can learn something about how we could design our communication style.
The useful point of attack here is that this model gives us a means of thinking about ways of speaking not just as right or wrong, polite or rude, but in a way that invites us to consider the effects, side effects, and reactions that those ways of conversing will bring about.
1 Iterated Game Theory
The iterated prisoner’s dilemma (IPD) is a game where players choose to cooperate or defect over multiple rounds. While it’s been criticised for overapplication, I think it has untapped potential in modelling conversations.
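To fix notation, here is a minimal sketch of a single round, using the conventional payoff ordering (temptation > reward > punishment > sucker); the specific numbers are illustrative, not canonical:

```python
# One round of the prisoner's dilemma. "C" = cooperate, "D" = defect.
# Conventional ordering: temptation (5) > reward (3) > punishment (1) > sucker (0).
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation: both get the reward
    ("C", "D"): (0, 5),  # A is suckered, B gets the temptation payoff
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection: both get the punishment
}

def play_round(move_a: str, move_b: str) -> tuple[int, int]:
    """Payoffs to (player A, player B) for one round."""
    return PAYOFFS[(move_a, move_b)]
```

Iterating just means repeating rounds and letting each player condition its next move on the history so far; that conditioning is where all the interesting strategy lives.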
2 Cheap Talk
Cheap talk refers to communication between players that doesn’t directly affect the payoffs in a game. In conversations, much of what we say can be seen as cheap talk—it’s not binding but can influence others’ actions. Except that maybe it is not so cheap.
A naïve economist might argue that words are worth nothing, but that doesn’t explain how much time people voluntarily spend on Twitter. It seems the regard, exposure and status we get from playing the game are worth something to us.
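As a minimal sketch of what “doesn’t directly affect the payoffs” means, consider a toy coordination game (the names and numbers are my own illustration, not a standard example): the message never appears in the payoff function, so it can only matter by changing what the receiver chooses to do.

```python
# Toy coordination game illustrating cheap talk. Note what is absent:
# the message is not an argument of the payoff function. Talk is free.
def payoff(sender_action: str, receiver_action: str) -> tuple[int, int]:
    table = {
        ("meet", "meet"): (2, 2),  # coordinating on the better option
        ("meet", "stay"): (0, 0),  # miscoordination
        ("stay", "meet"): (0, 0),
        ("stay", "stay"): (1, 1),  # coordinating on the worse option
    }
    return table[(sender_action, receiver_action)]

# "I'll be there" costs the sender nothing, whether true or false.
# Whether it moves the receiver depends on whether such messages tend
# to be believed, which is a population-level fact, not a property of
# the message itself.
```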
3 Conversation as an Iterated Game
Consider each exchange in a conversation as a move in an iterated game. The commodities we’re trading aren’t tangible but involve status, self-esteem, and information. How we choose to “play” in each conversational turn can build or erode trust over time.
4 Strategies in the Wild
4.1 Principle of Charity
This principle suggests that we should interpret others’ statements in the most rational way possible, assuming the best of their intentions. In game terms, it’s akin to starting with cooperation.
4.2 Possible Strategies
- Always Tell the Truth: In a world where everyone is honest, communication is efficient. However, this strategy is vulnerable to exploitation by liars.
- Always Lie: This leads to a breakdown in communication. If everyone lies, trust evaporates, and interactions become meaningless.
- Tit-for-Tat: Cooperate first, then mimic your partner’s previous move. This strategy promotes cooperation but punishes defection. (All three are sketched in code after this list.)

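Here is a rough sketch of those three strategies as functions from the opponent’s past moves to a move, reusing the PAYOFFS table from the earlier sketch; the scoring details are illustrative:

```python
def always_truth(opponent_moves):  # unconditional cooperation
    return "C"

def always_lie(opponent_moves):    # unconditional defection
    return "D"

def tit_for_tat(opponent_moves):   # cooperate first, then copy them
    return opponent_moves[-1] if opponent_moves else "C"

def match(strategy_a, strategy_b, rounds=10):
    """Play an iterated match; return (score_a, score_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each player sees only the other's past moves.
        a, b = strategy_a(moves_b), strategy_b(moves_a)
        pa, pb = PAYOFFS[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(match(tit_for_tat, always_lie))    # (9, 14): suckered once, then mutual defection
print(match(tit_for_tat, always_truth))  # (30, 30): cooperation all the way down
```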
5 Evolution of Communication Strategies
Which strategies persist depends on their evolutionary stability.
- Truth-Telling Populations: Efficient but vulnerable to deceitful invasion.
- Lying Populations: Inefficient and unstable, as mistrust hinders any meaningful exchange.
- Mixed Strategies: Populations that balance trust with scepticism may be more robust against exploitation (see the replicator sketch after this list).

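One crude way to see these population stories is discrete-time replicator dynamics, where each strategy’s share of the population grows in proportion to its fitness against the current mix. A toy sketch, building on the match() helper and strategies from the previous example (the starting shares are made up):

```python
STRATEGIES = [always_truth, always_lie, tit_for_tat]

def population_step(shares, rounds=10):
    """One generation: share_i grows as fitness_i / mean fitness."""
    fitness = [
        sum(q * match(s, t, rounds)[0] for q, t in zip(shares, STRATEGIES))
        for s in STRATEGIES
    ]
    mean = sum(p * f for p, f in zip(shares, fitness))
    return [p * f / mean for p, f in zip(shares, fitness)]

shares = [0.9, 0.1, 0.0]  # mostly truth-tellers, a few liars, no tit-for-tat
for _ in range(30):
    shares = population_step(shares)
print(shares)  # the liars take over: pure honesty is invadable

shares = [0.3, 0.1, 0.6]  # now seed the population with reciprocators
for _ in range(30):
    shares = population_step(shares)
print(shares)  # tit-for-tat holds the line and the liars dwindle
```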
In online spaces, we see these dynamics play out. Norms evolve as users interact, with certain communication styles becoming dominant.
6 Teams, Tribes, and Online Dynamics
Group identities heavily influence communication strategies.
- In-Group Communication: Often more cooperative, leveraging shared norms and trust.
- Out-Group Communication: Can be more competitive or hostile, as trust is lower.

Online platforms amplify these tribal dynamics, sometimes fostering echo chambers or increasing polarisation.
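One way to model this asymmetry, purely as an illustration, is to let strategies condition on a group “tag” as well as on history. The mechanism below is my own sketch, loosely in the spirit of tag-based cooperation models, not a standard construction:

```python
def tribal(my_tag):
    """A strategy that trusts its in-group and is wary of outsiders."""
    def strategy(opponent_tag, opponent_moves):
        if opponent_tag == my_tag:
            return "C"  # cooperate with the in-group by default
        # With outsiders, cooperate only after they have cooperated.
        return opponent_moves[-1] if opponent_moves else "D"
    return strategy
```

A population of such strategies cooperates happily within each tribe while sustaining mutual defection at the boundary, which is a reasonable cartoon of the echo-chamber dynamic.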
7 Practical Implications
So, why can’t people just be nice? And how can we encourage nicer interactions?
- Understanding Incentives: People respond to the “payoffs” in conversation—social approval, reputation, reciprocity.
- Designing for Cooperation: Platforms can encourage positive interactions by rewarding cooperation and discouraging negative behaviour.

8 What conversation strategies will spread?
The commodities that we trade in conversation are also constrained: status, self-esteem. I think the rarefied, abstracted economy of interpersonal status maps surprisingly well onto the abstracted economy of iterated game theory. Indeed, we often seem to behave as if the dynamics of online communication were no more complicated than those of the abstracted game-theory models.
I think that iterated game theory models answer questions such as why can’t people just be nice? and, more interestingly for me, how can we persuade people to be nicer? They let us get past boring, unactionable and shallow analyses such as people are mean, or those people are mean, unlike these people, and reason through how we can foster people being nice to one another. Let us get into it.
9 Which strategies prosper depends on which strategies are out there
A strategy needs to be able both to spread and to maintain itself once it has spread.
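To say “maintain” with a little more precision, we can borrow Maynard Smith’s standard condition. Writing $E(a, b)$ for the expected payoff to playing $a$ against $b$, a strategy $s$ is evolutionarily stable if, for every rare mutant strategy $t \ne s$, either

$$E(s, s) > E(t, s),$$

or

$$E(s, s) = E(t, s) \quad \text{and} \quad E(s, t) > E(t, t).$$

The first clause says natives outscore mutants when nearly everyone is a native, so the mutant cannot spread; the second handles ties by requiring natives to do better against the mutant itself. “Always tell the truth” fails this test: a lone liar scores strictly better against truth-tellers than truth-tellers score against each other.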
10 Incoming
- What collective moralities are possible? I think about them as “moral orbits”.

10.1 Word Salad
Short verbal notes I transcribed and copy-pasted here.
Why bother with this? For me, the reason is that even though these iterated game methods have fallen out of favour when it comes to modelling real economies or real international conflict or whatever, they might actually be unusually effective for modelling social interactions at large on the Internet. In particular, I think that some of the stylised observations the models give us are real insights into what types of communication can prosper in the public sphere, and how we can best communicate with one another effectively and maybe even kindly.
What are these novel insights? The first one for me, the big one that made waves when it first became big, is that we can’t just think about how things are, or how things would work if we could get to some hypothetical new state. In game theory, we have to think about everything as an evolutionary process. We can’t just think about a steady state or an ideal state. We have to consider the state we’re in now, the state we’d like to get to, how we might get there, and how we would maintain that state once we arrived. Some states we might like to get into just aren’t feasible to reach. Some that we could reach, we couldn’t maintain even if we got there, and that’s much like evolution itself.

Consider, for example, that it might be nice if all animals learned to be kind to one another, if the lion were to lie down with the lamb and so on. That would indeed be nice, but it is not a maintainable state for an ecosystem. An ecosystem with no predators, where everything gives away its defensive mechanisms, is vulnerable to invasion by predators. If the lion and the lamb lie down together and both forget how to fight, eventually wolves might turn up and eat them both. So any plausible ecosystem has to trade off between things going great for its participants and the ecosystem being robust against invasion by outsiders. We need systems that are in some sense self-maintaining, robust against new invasions from without or within.
An interesting corollary is that we might also need to be open to the idea that a strategy or way of communicating with one another might need to adapt over time. In the classic iterated prisoner’s dilemma tournaments, for example, tit-for-tat dominated early on, but in later noisy and evolutionary variants it was displaced by more forgiving strategies such as generous tit-for-tat; no single strategy stays on top in every environment.
All these examples are very biological. Let’s imagine how we might apply these kinds of concepts to public communication. A great example of a strategy that would be very high yield, if we could attain it, is always telling the truth. Imagine if everyone communicating on the Internet never lied, and always said exactly what they meant to the best of their ability. Communicating in such a world would be very simple, and we can imagine it being an amazingly effective world to live in. There would possibly be some downsides. Maybe you’d be told whether that stripey shirt really does look good on you with a little too much honesty, but in general it would probably be a fairly functional world to operate in.

However, this is probably not an evolutionarily stable strategy. In a world where everyone tells the truth all the time, the first person who discovers how to lie will do extremely well. That one person will be able to get away with telling all manner of untruths, because everyone else will have forgotten how to be suspicious, forgotten that there is any need to fact-check. In that sense, always telling the truth is a high-value but unstable strategy.

We can imagine the opposite: always lie. Never telling the truth whatsoever is probably a stable strategy, but also a very bad one, with very low surplus. In a world where all of us lie all the time, there would be no point in communicating at all. We couldn’t get anything done, so it would be pretty grim to live in. On the other hand, a lone truth-teller, a lone person who is actually capable of honest communication, can’t do much in this world. They have no one to truthfully communicate with.

We can imagine more complex situations: a society where people tell the truth a lot, and a society where people lie a lot, with some mixture of truth and lies in each at the population scale. A population of usual truth-tellers could do quite well, and a population of usual liars would do quite badly. One individual liar can invade a whole population of pure truth-tellers, but if you compare two societies, one with a stronger norm of truth, the truthful society will do quite well, and if it has enough liars to keep everyone on their toes, it may even be robust against invasion by populations of liars. This kind of complicated dynamic is exactly what we expect from an iterated game theory model.
That last example shows one of, I think, the key insights of iterated game theory: we can naturally think about competing populations of different strategies. We can be very general with this purely abstract theory. Consider a phenomenon observed particularly on the modern Internet: different cultures and subcultures have different communication strategies, both inside their groups and outward-facing. In this context, we might wonder whether the communication strategies of one particular group can be copied and spread throughout the entire population, whether they will achieve only partial success, or whether they will naturally self-limit. These are the kinds of questions we might ask about communication strategies when thinking about how they compete with one another on the Internet.
10.2 Links
Köster et al. (2022):
How do societies learn and maintain social norms? Here we use multiagent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviours. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behaviour is punished by other agents. The taboo helps overcome a credit assignment problem in discovering delayed health effects. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This “silly rule” counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. By probing what individual agents have learned, we demonstrate that normative behaviour relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Our results highlight the benefit of employing a multiagent reinforcement learning computational model focused on learning to implement complex actions.
