Iterated conversation games
Is arsehat a stable strategy?
2022-07-08 — 2025-03-28
Wherein iterated game theory is applied to online conversation, and status and cheap talk are treated as tradable commodities among online tribes to explain how communication norms evolve and are maintained.
Assumed audience:
Anyone who wonders how to make conversations with strangers nicer. Or, nastier.
Under heavy construction.
I’ve been ruminating on how the principles of iterated game theory can shed light on our communication norms, especially online. I want to explore how these models can help us understand and perhaps improve the ways we converse with one another.
I think I flogged a certain idea to death in tokenism and table stakes. I am not satisfied with the results. We can see a bad repeating pattern in there, but what is the takeaway? How do we avoid it?
Let’s have a crack at seeing that piece, about weird social dynamics, in a bigger framework. Spoiler: I aim to build up a framework in which we can consider general evolutionary-game-theory-type models of how we treat each other.
Disclaimer: I doubt I am the first person to think of this model, but I want to reason it through without a literature review to see where it takes me. Specifically, I want an iterated game theory model of communication norms and movement design, and to explain things like Schelling-Goodharting along the way to my own satisfaction. Maybe also Invasive arguments and coalition dynamics while we are at it, who knows.
There are two pieces in this play: iterated game theory and cheap talk. Plugging these together, I think we can learn something about how we could design our communication style.
The useful point of attack here is that this model gives us a means of thinking about ways of speaking not just as right or wrong, polite or rude, but in a way that invites us to consider the effects, side effects, and reactions that those ways of conversing will bring about.
1 Iterated Game Theory
The iterated prisoner’s dilemma (IPD) is a game where players choose to cooperate or defect over multiple rounds. While it’s been criticised for overapplication, I think it has untapped potential in modelling conversations.
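To fix notation, here is a minimal sketch of a single round, using the conventional payoff ordering (temptation > reward > punishment > sucker); the specific numbers are illustrative, not canonical:

```python
# One round of the prisoner's dilemma. "C" = cooperate, "D" = defect.
# Conventional ordering: temptation (5) > reward (3) > punishment (1) > sucker (0).
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation: both get the reward
    ("C", "D"): (0, 5),  # A is suckered, B gets the temptation payoff
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection: both get the punishment
}

def play_round(move_a: str, move_b: str) -> tuple[int, int]:
    """Payoffs to (player A, player B) for one round."""
    return PAYOFFS[(move_a, move_b)]
```

Iterating just means repeating rounds and letting each player condition its next move on the history so far; that conditioning is where all the interesting strategy lives.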
2 Cheap Talk
Cheap talk refers to communication between players that doesn’t directly affect the payoffs in a game. In conversations, much of what we say can be seen as cheap talk—it’s not binding but can influence others’ actions. Except that maybe it is not so cheap.
A naïve economist might argue that words are worth nothing, but that doesn’t explain how much time people voluntarily spend on Twitter. It seems the regard, exposure and status we get from playing the game are worth something to us.
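As a minimal sketch of what “doesn’t directly affect the payoffs” means, consider a toy coordination game (the names and numbers are my own illustration, not a standard example): the message never appears in the payoff function, so it can only matter by changing what the receiver chooses to do.

```python
# Toy coordination game illustrating cheap talk. Note what is absent:
# the message is not an argument of the payoff function. Talk is free.
def payoff(sender_action: str, receiver_action: str) -> tuple[int, int]:
    table = {
        ("meet", "meet"): (2, 2),  # coordinating on the better option
        ("meet", "stay"): (0, 0),  # miscoordination
        ("stay", "meet"): (0, 0),
        ("stay", "stay"): (1, 1),  # coordinating on the worse option
    }
    return table[(sender_action, receiver_action)]

# "I'll be there" costs the sender nothing, whether true or false.
# Whether it moves the receiver depends on whether such messages tend
# to be believed, which is a population-level fact, not a property of
# the message itself.
```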
3 Conversation as an Iterated Game
Consider each exchange in a conversation as a move in an iterated game. The commodities we’re trading aren’t tangible but involve status, self-esteem, and information. How we choose to “play” in each conversational turn can build or erode trust over time.
4 Strategies in the Wild
4.1 Principle of Charity
This principle suggests that we should interpret others’ statements in the most rational way possible, assuming the best of their intentions. In game terms, it’s akin to starting with cooperation.
4.2 Possible Strategies
- Always Tell the Truth: In a world where everyone is honest, communication is efficient. However, this strategy is vulnerable to exploitation by liars.
- Always Lie: This leads to a breakdown in communication. If everyone lies, trust evaporates, and interactions become meaningless.
- Tit-for-Tat: Cooperate first, then mimic your partner’s previous move. This strategy promotes cooperation but punishes defection. (All three are sketched in code after this list.)

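Here is a rough sketch of those three strategies as functions from the opponent’s past moves to a move, reusing the PAYOFFS table from the earlier sketch; the scoring details are illustrative:

```python
def always_truth(opponent_moves):  # unconditional cooperation
    return "C"

def always_lie(opponent_moves):    # unconditional defection
    return "D"

def tit_for_tat(opponent_moves):   # cooperate first, then copy them
    return opponent_moves[-1] if opponent_moves else "C"

def match(strategy_a, strategy_b, rounds=10):
    """Play an iterated match; return (score_a, score_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each player sees only the other's past moves.
        a, b = strategy_a(moves_b), strategy_b(moves_a)
        pa, pb = PAYOFFS[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(match(tit_for_tat, always_lie))    # (9, 14): suckered once, then mutual defection
print(match(tit_for_tat, always_truth))  # (30, 30): cooperation all the way down
```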
5 Evolution of Communication Strategies
Which strategies persist depends on their evolutionary stability.
- Truth-Telling Populations: Efficient but vulnerable to deceitful invasion.
- Lying Populations: Inefficient and unstable, as mistrust hinders any meaningful exchange.
- Mixed Strategies: Populations that balance trust with scepticism may be more robust against exploitation (see the replicator sketch after this list).

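One crude way to see these population stories is discrete-time replicator dynamics, where each strategy’s share of the population grows in proportion to its fitness against the current mix. A toy sketch, building on the match() helper and strategies from the previous example (the starting shares are made up):

```python
STRATEGIES = [always_truth, always_lie, tit_for_tat]

def population_step(shares, rounds=10):
    """One generation: share_i grows as fitness_i / mean fitness."""
    fitness = [
        sum(q * match(s, t, rounds)[0] for q, t in zip(shares, STRATEGIES))
        for s in STRATEGIES
    ]
    mean = sum(p * f for p, f in zip(shares, fitness))
    return [p * f / mean for p, f in zip(shares, fitness)]

shares = [0.9, 0.1, 0.0]  # mostly truth-tellers, a few liars, no tit-for-tat
for _ in range(30):
    shares = population_step(shares)
print(shares)  # the liars take over: pure honesty is invadable

shares = [0.3, 0.1, 0.6]  # now seed the population with reciprocators
for _ in range(30):
    shares = population_step(shares)
print(shares)  # tit-for-tat holds the line and the liars dwindle
```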
In online spaces, we see these dynamics play out. Norms evolve as users interact, with certain communication styles becoming dominant.
6 Teams, Tribes, and Online Dynamics
Group identities heavily influence communication strategies.
- In-Group Communication: Often more cooperative, leveraging shared norms and trust.
- Out-Group Communication: Can be more competitive or hostile, as trust is lower.

Online platforms amplify these tribal dynamics, sometimes fostering echo chambers or increasing polarisation.
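One way to model this asymmetry, purely as an illustration, is to let strategies condition on a group “tag” as well as on history. The mechanism below is my own sketch, loosely in the spirit of tag-based cooperation models, not a standard construction:

```python
def tribal(my_tag):
    """A strategy that trusts its in-group and is wary of outsiders."""
    def strategy(opponent_tag, opponent_moves):
        if opponent_tag == my_tag:
            return "C"  # cooperate with the in-group by default
        # With outsiders, cooperate only after they have cooperated.
        return opponent_moves[-1] if opponent_moves else "D"
    return strategy
```

A population of such strategies cooperates happily within each tribe while sustaining mutual defection at the boundary, which is a reasonable cartoon of the echo-chamber dynamic.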
7 Practical Implications
So, why can’t people just be nice? And how can we encourage nicer interactions?
- Understanding Incentives: People respond to the “payoffs” in conversation—social approval, reputation, reciprocity.
- Designing for Cooperation: Platforms can encourage positive interactions by rewarding cooperation and discouraging negative behaviour.

8 What conversation strategies will spread?
The commodities that we trade in conversation are also constrained: status, self-esteem. I think the rarefied, abstracted economy of interpersonal status maps surprisingly well onto the abstracted economy of iterated game theory. Indeed, we often seem to behave as if the dynamics of online communication were no more complicated than those of the abstracted game-theory models.
I think that iterated game theory models answer questions such as why can’t people just be nice? and, more interestingly for me, how can we persuade people to be nicer? They let us get past boring, unactionable and shallow analyses such as people are mean, or those people are mean, unlike these people, and reason through how we can foster people being nice to one another. Let us get into it.
9 Which strategies prosper depends on which strategies are out there
A strategy needs to be able both to spread and to maintain itself once it has spread.
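To say “maintain” with a little more precision, we can borrow Maynard Smith’s standard condition. Writing $E(a, b)$ for the expected payoff to playing $a$ against $b$, a strategy $s$ is evolutionarily stable if, for every rare mutant strategy $t \ne s$, either

$$E(s, s) > E(t, s),$$

or

$$E(s, s) = E(t, s) \quad \text{and} \quad E(s, t) > E(t, t).$$

The first clause says natives outscore mutants when nearly everyone is a native, so the mutant cannot spread; the second handles ties by requiring natives to do better against the mutant itself. “Always tell the truth” fails this test: a lone liar scores strictly better against truth-tellers than truth-tellers score against each other.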
10 Incoming
- What collective moralities are possible? I think about them as “moral orbits”.

10.1 Word Salad
Short verbal notes I transcribed and copy-pasted here.
Why bother with this? For me, the reason is that even though these iterated game methods have fallen out of favour when it comes to modelling real economies or real international conflict or whatever, they might actually be unusually effective for modelling social interactions at large on the Internet. In particular, I think that some of the stylised observations the models give us are real insights into what types of communication can prosper in the public sphere, and how we can best communicate with one another effectively and maybe even kindly.
What are these novel insights? The first one for me, the big one that made waves when it first became big, is that we can’t just think about how things are, or how things would work if we could get to some hypothetical new state. In game theory, we have to think about everything as an evolutionary process. We can’t just think about a steady state or an ideal state. We have to consider the state we’re in now, the state we’d like to get to, how we might get there, and how we would maintain that state once we arrived. Some states we might like to get into just aren’t feasible to reach. Some that we could reach, we couldn’t maintain even if we got there, and that’s much like evolution itself.

Consider, for example, that it might be nice if all animals learned to be kind to one another, if the lion were to lie down with the lamb and so on. That would indeed be nice, but it is not a maintainable state for an ecosystem. An ecosystem with no predators, where everything gives away its defensive mechanisms, is vulnerable to invasion by predators. If the lion and the lamb lie down together and both forget how to fight, eventually wolves might turn up and eat them both. So any plausible ecosystem has to trade off between things going great for its participants and the ecosystem being robust against invasion by outsiders. We need systems that are in some sense self-maintaining, robust against new invasions from without or within.
An interesting corollary is that we might also need to be open to the idea that a strategy or way of communicating with one another might need to adapt over time. In the classic iterated prisoner’s dilemma tournaments, for example, tit-for-tat dominated early on, but in later noisy and evolutionary variants it was displaced by more forgiving strategies such as generous tit-for-tat; no single strategy stays on top in every environment.
All these examples are very biological. Let’s imagine how we might apply these kinds of concepts to public communication. A great example of a strategy that would be very high yield, if we could attain it, is always telling the truth. Imagine if everyone communicating on the Internet never lied, and always said exactly what they meant to the best of their ability. Communicating in such a world would be very simple, and we can imagine it being an amazingly effective world to live in. There would possibly be some downsides. Maybe you’d be told whether that stripey shirt really does look good on you with a little too much honesty, but in general it would probably be a fairly functional world to operate in.

However, this is probably not an evolutionarily stable strategy. In a world where everyone tells the truth all the time, the first person who discovers how to lie will do extremely well. That one person will be able to get away with telling all manner of untruths, because everyone else will have forgotten how to be suspicious, forgotten that there is any need to fact-check. In that sense, always telling the truth is a high-value but unstable strategy.

We can imagine the opposite: always lie. Never telling the truth whatsoever is probably a stable strategy, but also a very bad one, with very low surplus. In a world where all of us lie all the time, there would be no point in communicating at all. We couldn’t get anything done, so it would be pretty grim to live in. On the other hand, a lone truth-teller, a lone person who is actually capable of honest communication, can’t do much in this world. They have no one to truthfully communicate with.

We can imagine more complex situations: a society where people tell the truth a lot, and a society where people lie a lot, with some mixture of truth and lies in each at the population scale. A population of usual truth-tellers could do quite well, and a population of usual liars would do quite badly. One individual liar can invade a whole population of pure truth-tellers, but if you compare two societies, one with a stronger norm of truth, the truthful society will do quite well, and if it has enough liars to keep everyone on their toes, it may even be robust against invasion by populations of liars. This kind of complicated dynamic is exactly what we expect from an iterated game theory model.
That last example shows one of, I think, the key insights of iterated game theory: we can naturally think about competing populations of different strategies. We can be very general with this purely abstract theory. Consider a phenomenon observed particularly on the modern Internet: different cultures and subcultures have different communication strategies, both inside their groups and outward-facing. In this context, we might wonder whether the communication strategies of one particular group can be copied and spread throughout the entire population, whether they will achieve only partial success, or whether they will naturally self-limit. These are the kinds of questions we might ask about communication strategies when thinking about how they compete with one another on the Internet.
10.2 Links
Köster et al. (2022):
How do societies learn and maintain social norms? Here we use multiagent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviours. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behaviour is punished by other agents. The taboo helps overcome a credit assignment problem in discovering delayed health effects. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This “silly rule” counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. By probing what individual agents have learned, we demonstrate that normative behaviour relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Our results highlight the benefit of employing a multiagent reinforcement learning computational model focused on learning to implement complex actions.
