Learning from the madness of crowds

God help me I need to extract truth from the internet

2026-04-19 — 2026-06-15

Wherein the Surprisingly Popular Algorithm Is Examined as a Model for Extracting Minority-Correct Beliefs From Biased Informants, Alongside Market-Based Mechanisms as Alternatives to Passive Corpus Learning.

adaptive

agents

bounded compute

collective knowledge

communicating

cooperation

democracy

distributed

economics

game theory

how do science

incentive mechanisms

institutions

mind

networks

provenance

sociology

standards

virality

Starting questions.

1 Why do I believe things?

Figure 2: My belief $B_{\text{dan}}$, drawn not as a passive variable but as a *policy* — the filled node of the mechanised convention, an agenty rule for what to conclude and what to do about it. Two channels feed it. On the left I observe the world, recording two cause-and-effect pairs $C_i \to E_i$ (solid, since the causation is out there) that inform my belief (dashed). On the right, Alice and Bob run belief-policies built like mine, and their conclusions flow into mine along the dash-dotted edges — docility, importing a neighbour’s answer rather than re-deriving it, and the back door by which other minds reach my belief. Then the belief does work: it drives a decision $C_3$, a rectangle because — unlike the observed causes $C_1, C_2$ — this one I get to *set*, the intervention $\mathrm{do}(C_3)$, chosen to bring about a desired effect $E_3$ whose worth to me is the utility $U$.

1.1 Certainty

I like to claim to be a Good Bayesian and thus I never Believe Things with certainty. No, I say, my mind is too well schooled in the arts of Optimal Updating, and I am too steeped in the arts of subjective probability to do any of that certainty business. No, I believe nothing with iron-clad certainty; rather I hold all hypotheses in mind, weighted by my current prior probability. Except, of course, the mechanisms of Bayesian updating itself.

This is of course bollocks and some things I treat as if they were true. I believe in gravity, and the existence of apples, and so on. I might conceivably be shown, eventually, to be wrong about these things, but I will be totally surprised and will have made no long-shot contingency plans against the absence of gravity or apples; not so much as a small amulet with the face of Newton to bless me in my travails.

TODO: Hyland and Albarracin (2025).

2 Slop lit review

What follows is a lightly preened AI lit review of formalisms that capture aspects of this problem, organised by how directly they address the question: how can a learner extract a good world model from data generated by strategic, biased, or adversarial agents?

2.1 Crowdsourcing models

Due to Dawid and Skene (1979) and descendants.

In the classical setup, there is a latent ground-truth label for each item, and $n$ noisy annotators each report it through an unknown confusion matrix. Expectation-maximisation estimates these matrices jointly with the labels; spectral methods (Zhang et al. 2016) can identify annotator quality without ground truth. The limitation for our purposes is that Dawid-Skene-type methods need many annotators to label the same items, so that their disagreements can be compared. Internet authors write about different things. Still, the mathematical machinery (latent truth, heterogeneous noise, identification via redundancy) might be useful.
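To make the latent-truth machinery concrete, here is a minimal EM sketch of a Dawid-Skene-style model. It is an illustration under stated assumptions, not the original paper's exact procedure: the function name `dawid_skene`, the `-1` convention for "did not label", the smoothing constants, and the simulated annotators are all made up for the example.

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50):
    """EM for a Dawid-Skene-style model.

    labels: (n_items, n_annotators) int array; -1 marks "did not label".
    Returns (posterior over true labels, per-annotator confusion matrices).
    """
    n_items, n_annot = labels.shape
    # Initialise the posterior with each item's vote shares.
    post = np.stack([(labels == k).sum(axis=1) for k in range(n_classes)], axis=1)
    post = (post + 1e-9) / (post + 1e-9).sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: class prior and confusion[a, true, reported], lightly smoothed.
        prior = post.mean(axis=0)
        confusion = np.full((n_annot, n_classes, n_classes), 1e-6)
        for a in range(n_annot):
            for reported in range(n_classes):
                confusion[a, :, reported] += post[labels[:, a] == reported].sum(axis=0)
        confusion /= confusion.sum(axis=2, keepdims=True)
        # E-step: recompute the posterior over each item's true label.
        log_post = np.tile(np.log(prior + 1e-12), (n_items, 1))
        for a in range(n_annot):
            seen = labels[:, a] >= 0
            log_post[seen] += np.log(confusion[a][:, labels[seen, a]]).T
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post, confusion

# Simulated crowd (assumption): three decent annotators, two who mostly flip the label.
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=1000)
accuracy = np.array([0.9, 0.9, 0.9, 0.2, 0.2])
correct = rng.random((1000, 5)) < accuracy
labels = np.where(correct, truth[:, None], 1 - truth[:, None])

post, confusion = dawid_skene(labels, n_classes=2)
ds_acc = (post.argmax(axis=1) == truth).mean()
mv_acc = ((labels.mean(axis=1) > 0.5).astype(int) == truth).mean()
print(f"majority vote: {mv_acc:.3f}, Dawid-Skene: {ds_acc:.3f}")
```

In this toy setting the contrarian annotators are informative once their confusion matrices are learned, which is why the model can beat majority vote. Nothing here addresses the sparse-overlap problem of real internet text.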

2.2 Robust statistics

We can estimate the mean of a distribution even when an $\varepsilon$-fraction of samples are adversarially corrupted, as long as the clean distribution has bounded moments. This is because, in high dimensions, adversarial corruption distorts the covariance in detectable ways. Specifically, it introduces “spurious” eigenvalues, which we can then filter out. By analogy we might hope that even if $\varepsilon$ of the internet is adversarially generated, if the “clean” distribution over text had enough structure, we could learn through the noise. The book to read is Diakonikolas & Kane’s Algorithmic High-Dimensional Robust Statistics (Diakonikolas and Kane 2023).

Once again, this is only an analogy; several things would need pinning down before it did any work. For example, what are the “bounded moments” assumptions in the text setting? What does “adversarial corruption” even mean when the data has sequential structure? 🏗
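For plain vectors (not text), the eigenvalue-filtering idea can be sketched as below. This is a deliberately simplified heuristic, not the algorithm from the book: it assumes the clean data has roughly identity covariance, and the name `filtered_mean`, the threshold `1.5` and the "drop the top `eps / 2` scores" rule are illustrative choices.

```python
import numpy as np

def filtered_mean(x, eps, threshold=1.5, max_rounds=50):
    """Simplified spectral filter, assuming clean data with ~identity covariance."""
    x = np.asarray(x, dtype=float)
    for _ in range(max_rounds):
        mu = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        if eigvals[-1] <= threshold:
            break  # no suspiciously large direction of variance remains
        v = eigvecs[:, -1]
        # Score points by squared projection onto the suspicious direction,
        # then drop the highest-scoring eps / 2 fraction.
        scores = ((x - mu) @ v) ** 2
        x = x[scores <= np.quantile(scores, 1 - eps / 2)]
    return x.mean(axis=0)

# Toy corruption (assumption): 10% of points are moved to a tight far-away cluster.
rng = np.random.default_rng(1)
d, n, eps = 20, 4000, 0.1
x = rng.standard_normal((n, d))  # clean samples, true mean is 0
n_bad = int(eps * n)
shift = 5.0 * np.ones(d) / np.sqrt(d)
x[:n_bad] = shift + 0.1 * rng.standard_normal((n_bad, d))

naive_err = np.linalg.norm(x.mean(axis=0))
robust_err = np.linalg.norm(filtered_mean(x, eps))
print(f"naive error: {naive_err:.3f}, filtered error: {robust_err:.3f}")
```

The corrupted cluster shows up as an inflated top eigenvalue, which is the "detectable distortion" the paragraph above refers to. A cleverer adversary would spread its corruption out, and that is where the real algorithms earn their keep.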

2.2.1 Peer prediction and Bayesian Truth Serum

As noted in Bayesian epistemics, we can hope to extract truth from crowds by thinking about incentives, Bayes, and meta-knowledge. Key works here are Prelec (2004) and descendants (Witkowski and Parkes 2012; Miller, Resnick, and Zeckhauser 2005). The “surprisingly popular” (SP) algorithm seems like a good start: it extracts truth from crowds by finding answers that are more popular than predicted — exploiting meta-knowledge. Concretely: people who hold the correct-but-minority view often know their view is rare, so they predict lower support for it from everyone else. The SP algorithm uses this gap between actual and predicted popularity as a signal.
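A minimal sketch of the surprisingly popular rule as described here: pick the answer whose actual vote share most exceeds its average predicted share. The function name and the survey numbers are hypothetical.

```python
from collections import Counter

def surprisingly_popular(votes, predicted_shares):
    """votes: list of answers.
    predicted_shares: one dict per respondent, answer -> predicted vote share.
    Returns (winning answer, dict of actual-minus-predicted share per answer).
    """
    n = len(votes)
    actual = {a: c / n for a, c in Counter(votes).items()}
    answers = set(actual) | {a for p in predicted_shares for a in p}
    predicted = {
        a: sum(p.get(a, 0.0) for p in predicted_shares) / len(predicted_shares)
        for a in answers
    }
    surplus = {a: actual.get(a, 0.0) - predicted[a] for a in answers}
    return max(surplus, key=surplus.get), surplus

# Hypothetical survey: the majority says "yes" and expects broad agreement;
# the minority says "no" and knows it is in the minority.
votes = ["yes"] * 65 + ["no"] * 35
predictions = [{"yes": 0.8, "no": 0.2}] * 65 + [{"yes": 0.7, "no": 0.3}] * 35

winner, surplus = surprisingly_popular(votes, predictions)
majority = Counter(votes).most_common(1)[0][0]
print(majority, winner, surplus)
```

Here "no" gets 35% of the votes but was predicted to get only 23.5%, so it wins despite losing the plain vote.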

The SP algorithm works because it exploits the gap between first-order beliefs (what I think) and second-order beliefs (what I think others think). Does the structure of the internet corpus contain enough second-order information to support something like this? Blog posts often respond to other positions, explicitly modelling what “they” believe. This meta-discursive structure might be informative in the SP sense. See also Collina et al. (2025) — collaborative prediction via agreement, where two parties with different features arrive at better predictions by iteratively sharing only their predictions (not their data). The internet might be doing a massively distributed version of this, poorly.

An LLM trained on diverse sources might do something analogous, finding coherent patterns that show up more than we’d expect if the corpus were just noise. Speculatively, the mathematical structure (using the surprise in the distribution, not just the mode) might connect to information-theoretic ideas about compressibility.

2.3 Performative prediction

Predictions that change the distribution they predict. Relevant because: internet content is generated in response to existing discourse. An LLM trained on internet text is predicting a reflexive process. The performative prediction framework gives conditions under which retraining converges to a stable fixed point despite this feedback loop. The limitation: performative prediction is about a single predictor affecting a population, not about extracting signal from a population of strategic reporters.
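A toy illustration of the fixed-point claim, under assumptions that are mine rather than the source's: the outcome has mean `mu0 + eps * theta` when prediction `theta` is deployed, and we retrain under squared loss, so each retraining step just moves `theta` to the mean of the distribution it induced. The names `mu0`, `eps` and `repeated_retraining` are made up for the example.

```python
def repeated_retraining(mu0, eps, theta0=0.0, steps=50):
    """Iterate theta <- argmin_theta' E[(z - theta')^2] with E[z] = mu0 + eps * theta."""
    theta = theta0
    path = [theta]
    for _ in range(steps):
        # The population squared-loss minimiser is the mean of the induced distribution.
        theta = mu0 + eps * theta
        path.append(theta)
    return path

stable = repeated_retraining(mu0=1.0, eps=0.5)    # weak feedback
unstable = repeated_retraining(mu0=1.0, eps=1.5)  # strong feedback
print(stable[-1], unstable[-1])
```

With weak feedback (`|eps| < 1`) retraining contracts to the stable point `mu0 / (1 - eps)`; with strong feedback it runs away. That is the one-dimensional shadow of the sensitivity conditions in the framework.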

2.4 Multi-agent inverse reinforcement learning

If the corpus consists of demonstrations from agents with heterogeneous (unknown) reward functions, next-token prediction is a kind of multi-agent IRL. We’re trying to infer… what? A shared world model? An aggregated reward function? The reward decomposition question matters: is there a “common world model” component that all agents share (physics, grammar, entity knowledge) plus agent-specific “bias” components (goals, political views, persuasive intent)? If so, can we identify conditions under which the common component is recoverable? Yu, Song, and Ermon (2019) propose adversarial multi-agent IRL but the text-corpus setting hasn’t been addressed, AFAIK. 🏗

2.5 Data Shapley and data valuation

Ghorbani and Zou (2019) (“Data Shapley”) and descendants. Which data points contribute what to model capabilities? This connects to “what are the votes for.” Recent work applies Shapley values specifically to LLM fine-tuning data. The conceptual contribution: treating data points as players in a cooperative game, where the “value” of each is its marginal contribution to model performance. But this is retrospective (evaluating existing data) rather than generative (telling us how the learner extracts signal).

2.6 The “internet as latent mixture model”

More carefully: model each document as generated by first sampling a “type” (topic $\times$ intent $\times$ competence $\times$ honesty), then generating text conditionally. The LLM learns the full joint. At inference time, conditioning on a well-crafted prompt (e.g. “a careful, accurate, well-sourced explanation of X”) selects the subpopulation of types whose documents would plausibly begin that way, which (if the type space is rich enough) filters out the biased types. This is just Bayesian conditioning in the latent type space. It would explain why instruction-tuning and RLHF work at all: they’re shifting the posterior over types toward “helpful, honest, harmless.” But what determines whether the latent type space has enough resolution to make this filtering effective? 🏗
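Written out, with notation introduced only for illustration ($t$ for the latent type, $c$ for the prompt or context, $y$ for the continuation), the conditioning story is:

$$
p(y \mid c) = \sum_{t} p(y \mid t, c)\, p(t \mid c),
\qquad
p(t \mid c) = \frac{p(c \mid t)\, p(t)}{\sum_{t'} p(c \mid t')\, p(t')} .
$$

On this reading the prompt filters well only when the likelihood ratio $p(c \mid t_{\text{careful}}) / p(c \mid t_{\text{biased}})$ is large relative to the prior odds $p(t_{\text{biased}}) / p(t_{\text{careful}})$. If biased types can cheaply imitate careful-sounding openings, the ratio is near one and conditioning buys little.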

2.7 Free energy and active inference multi-agent models

Hyland et al. (2024) propose modelling interactions between boundedly-rational agents as free-energy equilibria. Walters et al. (2025) apply free-energy risk metrics to multi-agent systems. The active-inference framing says each agent minimises its own free energy (surprise), and the question becomes: under what conditions does a collection of surprise-minimising agents produce an aggregate that is itself low-surprise in a useful sense? Crutchfield and Jurgens (2025) on “agentic information theory” (the intrinsic semantics of information processes) might be relevant here. This feels potentially deep but I haven’t worked through the maths. 🏗

3 What might we actually build?

If I’m thinking like an ML person about what’s buildable, the following seems tractable:

Empirically test the “compression as debiasing” hypothesis. The conjecture: more compression pressure (larger models) recovers cleaner signal from a biased corpus, since networks fit compressible structure before noise (Arpit et al. 2017) and language modelling is compression (Delétang et al. 2024). But (Krestnikov2026Truth?) runs essentially this experiment and finds the catch: gradient descent favours the most compressible answer cluster, not truth. Recovery scales with model size against random corruption (65→85%) but collapses to chance against a coherent alternative rule system. So compression debiases incompressible noise, not a structured adversary, which is exactly the strategic-corruption case we care about. Concretely: inject varying fractions of random vs. coherent misinformation into a factual QA corpus, then measure recovery as a function of model scale and corruption type.
Latent-type inference. Train a model to infer the “type” of a document’s author (or at least a useful projection of it), then study whether conditioning on inferred type allows better extraction of factual content. This connects to the Dawid-Skene line but with the Bayesian conditioning approach in the mixture-model view.
Robust aggregation over LLM representations. Use robust mean estimation techniques (Diakonikolas and Kane 2023) on the internal representations of an LLM processing different documents about the same topic. If the representations of biased documents are “corrupted” relative to the factual ones, robust aggregation should recover the factual representation.
SP-inspired probing. For claims in the LLM’s training data, estimate both the LLM’s “belief” (probability it assigns to the claim being true) and its “meta-belief” (what it predicts other sources would say). The gap between these is the “surprisingly popular” signal, and might identify areas where the LLM has extracted minority-but-correct information.

4 The prescriptive reframing: active learners, not passive consumers

The descriptive question (“how does an LLM accidentally debias?”) might be the wrong entry point. It invites post-hoc rationalisation of an opaque process. The prescriptive question is better-formed: if we were designing agents that must learn from each other’s outputs in a strategic/adversarial information environment, what mechanisms should they use?

The shift from “passive learner consuming a static corpus” to “active learner in a population of strategic agents” changes the problem structure. The learner isn’t just filtering noise — the learner’s actions (queries, publications, bets) change what other agents produce. And the learner can design its interactions to extract more information.

4.1 Key positive and negative results

Negative:

Speed of learning in network equilibria is bounded by a constant depending only on private signal distributions, independent of network size (Huang, Strack, and Tamuz 2024). As networks grow, almost all private information is lost. Naive information sharing fails even at scale. This is a strong constraint on any “just let agents talk” approach.
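Arrow and Gibbard-Satterthwaite: with three or more alternatives, no ordinal aggregation rule satisfies unanimity and independence of irrelevant alternatives without being dictatorial (Arrow), and every non-dictatorial voting rule that can elect any alternative is manipulable (Gibbard-Satterthwaite).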
Information aggregation under ambiguity (Galanis, Ioannou, and Kotronis 2024) — ambiguity aversion distorts aggregation even in designed mechanisms.

Positive:

Prediction markets converge to rational expectations under certain conditions. The prices are the aggregation.
Proper scoring rules (Gneiting and Raftery 2007) incentivise truthful reporting, and connect to Bregman divergences.
Surprisingly popular (Prelec 2004; Prelec, Seung, and McCoy 2017) extracts minority-correct answers using meta-knowledge.
Robust statistics can handle $\varepsilon$-corruption if the clean distribution has structure.
Collina et al. (2025): collaborative prediction via iterated agreement, sharing only predictions (not features), with communication cost independent of data dimensionality.
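To see what “proper” buys us in the scoring-rule item above, here is a small numerical check. The agent's belief of 0.7 and the grid of candidate reports are arbitrary; the “linear” score is included only as a standard example of an improper rule.

```python
import math

def log_score(q, outcome):
    """Logarithmic score for reporting probability q of the event."""
    return math.log(q if outcome else 1.0 - q)

def brier_score(q, outcome):
    """Negative squared error (higher is better)."""
    return -((outcome - q) ** 2)

def linear_score(q, outcome):
    """Probability assigned to what happened. Looks natural, but is not proper."""
    return q if outcome else 1.0 - q

def best_report(score, belief, grid):
    """Report on the grid that maximises expected score under the agent's belief."""
    def expected(q):
        return belief * score(q, 1) + (1.0 - belief) * score(q, 0)
    return max(grid, key=expected)

grid = [i / 100 for i in range(1, 100)]
belief = 0.7
best = {
    name: best_report(fn, belief, grid)
    for name, fn in [("log", log_score), ("brier", brier_score), ("linear", linear_score)]
}
print(best)
```

Under the log and Brier scores the agent's best report is its actual belief. Under the linear score it pays to exaggerate all the way to the edge of the grid.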

4.2 Formalisms opened up by the prescriptive framing

Prediction markets as the canonical case. The cleanest existing mechanism for extracting beliefs from strategic agents. A proper scoring rule incentivises truthful reporting; a prediction market aggregates via prices. Limitations everyone knows: thin markets, manipulation, subsidisation costs. But the mathematical structure (agents reveal through bets, the mechanism aggregates) is the prescriptive framing in its purest form. The question: can we design something market-like for general knowledge extraction rather than just binary event prediction? Sudhir and Tran-Thanh (2025) on market-based architectures in RL is suggestive. Olckers and Walsh (2024) survey what goes wrong.
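As one concrete instance of “agents reveal through bets, the mechanism aggregates”, here is a sketch of a logarithmic market scoring rule (LMSR) market maker. LMSR is my choice of example; the text above does not commit to a particular market design. The class name and the liquidity parameter `b=10.0` are illustrative.

```python
import math

class LMSRMarket:
    """Automated market maker with cost C(q) = b * log(sum_i exp(q_i / b))."""

    def __init__(self, n_outcomes, b=10.0):
        self.b = b
        self.q = [0.0] * n_outcomes  # outstanding shares per outcome

    def _cost(self, q):
        m = max(q)  # subtract the max for numerical stability
        return m + self.b * math.log(sum(math.exp((x - m) / self.b) for x in q))

    def prices(self):
        m = max(self.q)
        w = [math.exp((x - m) / self.b) for x in self.q]
        total = sum(w)
        return [x / total for x in w]

    def buy(self, outcome, shares):
        """Buy shares paying 1 if `outcome` occurs; returns the amount charged."""
        new_q = list(self.q)
        new_q[outcome] += shares
        paid = self._cost(new_q) - self._cost(self.q)
        self.q = new_q
        return paid

market = LMSRMarket(n_outcomes=2, b=10.0)
# A trader who believes outcome 0 has probability 0.8 buys until the price says so.
shares = market.b * math.log(0.8 / 0.2)
paid = market.buy(0, shares)
expected_profit = 0.8 * shares - paid
print(market.prices(), paid, expected_profit)
```

The price vector doubles as the market's aggregate probability estimate, and a myopic risk-neutral trader profits in expectation by moving it to their own belief. The subsidy the market maker can lose is bounded by `b * log(n_outcomes)`, which is the “subsidisation cost” in miniature.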

Opponent shaping (LOLA, M-FOS, COLA). Foerster et al.’s “Learning with Opponent-Learning Awareness” and descendants. Instead of best-responding to other agents’ current strategies, model how they’ll adapt to ours, and optimise for the resulting trajectory. If our learner queries/interacts with biased agents repeatedly, it should account for how those agents will respond to being queried. A naive learner that just asks questions gets gamed; an opponent-shaping learner can steer the interaction toward more informative equilibria. The connection holds if we think of the learner as playing a repeated game against information sources with incentives to mislead.

Bayesian persuasion, inverted. Kamenica & Gentzkow’s framework (introduced in Bayesian epistemics as the dual of truth-elicitation) has the sender designing an information structure to influence a receiver. Our learner is the receiver trying to extract information despite the sender’s strategic design. Understanding the sender’s optimal strategy tells us what the worst-case bias structure looks like.¹ This is useful because it gives us the adversary’s optimal play, which is exactly what we need to design against.

Information design meets mechanism design. S. Chen et al. (2023): proper scoring rules meet principal-agent models. Altman and Tennenholtz (2007): incentive-compatible ranking systems. Y. Chen and Yu (2024): scoring rule design under partial knowledge. The prescriptive question: design a protocol where agents’ self-interested behaviour is the information extraction mechanism. Carey et al. (2025) on incentives for responsiveness and instrumental control is relevant too — what incentives does the mechanism create for the agents being queried?

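Kalai and Lehrer (1993): rational learning leads to Nash equilibrium. Agents who update beliefs rationally in a repeated game will converge to (approximate) equilibrium play, provided their prior beliefs do not rule out the play that actually occurs (an absolute-continuity, or “grain of truth”, condition). The question is whether learning about source reliability converges fast enough to be useful before the information environment shifts.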

4.3 The implicit market hypothesis

The internet isn’t a designed mechanism, but it has market-like properties. Agents compete for attention (a kind of currency). If we could design a better attention-allocation mechanism — one that rewards informativeness rather than engagement — we’d have a prediction-market-like structure for general knowledge. This is, IMO, the most buildable version of the prescriptive research programme: take the mathematical structure of information markets (proper scoring rules, market makers, sequential trade) and adapt it to the problem of producing reliable knowledge from distributed agents with heterogeneous incentives.

What makes this hard: in a prediction market, there’s an eventual ground truth (did the event happen?). For general knowledge, the “resolution” mechanism is unclear. Dasgupta and Ghosh (2013) on crowdsourced judgement elicitation with endogenous proficiency and Carvalho (2010) on sharing rewards based on subjective opinions address this gap — mechanisms for eliciting and aggregating beliefs where no resolution event exists.

4.4 What would we build? (prescriptive version)

Market-based knowledge aggregation protocol. Design a mechanism where agents submit probabilistic claims and are scored by a rule that doesn’t require ground truth (the peer-prediction family — see Bayesian epistemics — and Miller, Resnick, and Zeckhauser (2005), Witkowski and Parkes (2012)). Study whether the mechanism produces better calibrated beliefs than unstructured information sharing. This is an experiment, not a product — but it has the shape of something that could become a product.
Opponent-shaping query strategies. Implement a learner that models source incentives and adapts its queries to extract maximum information. Compare against naive querying on a simulated population of strategic agents with known bias structures. The LOLA/COLA machinery provides the gradient-based update rule; the question is whether it helps in an information-extraction setting rather than the usual cooperation/competition settings.
Robust aggregation with reputation. Combine robust mean estimation (Diakonikolas and Kane 2023) with an online reputation system (bandit-like credibility tracking). Each round: query sources, aggregate robustly, update reputations. Study convergence rates as a function of corruption fraction and reputation learning speed.
Information market for LLM alignment. Instead of RLHF’s Borda-ish aggregation, implement a market mechanism where annotators “bet” on which response is better, with payoffs determined by peer prediction. Compare alignment quality to standard RLHF. This connects back to Analogy A above but with the prescriptive mechanism-design lens.
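The “robust aggregation with reputation” item in the list above has a very small one-dimensional analogue, sketched here. Everything is a stand-in: a weighted median plays the role of robust mean estimation, a multiplicative-weights update (full-information, so not really a bandit) plays the role of credibility tracking, and the sources are simulated.

```python
import numpy as np

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]

def aggregate_with_reputation(reports, eta=0.5):
    """reports: (n_rounds, n_sources). Returns (per-round estimates, final reputations)."""
    n_rounds, n_sources = reports.shape
    reputation = np.ones(n_sources) / n_sources
    estimates = np.empty(n_rounds)
    for t, r in enumerate(reports):
        estimates[t] = weighted_median(r, reputation)
        # Penalise sources by (clipped) distance from the robust aggregate.
        loss = np.minimum(np.abs(r - estimates[t]), 1.0)
        reputation = reputation * np.exp(-eta * loss)
        reputation /= reputation.sum()
    return estimates, reputation

# Simulated sources (assumption): six noisy-honest, four that always overshoot by 3.
rng = np.random.default_rng(2)
truth, n_rounds = 1.0, 200
honest = truth + 0.1 * rng.standard_normal((n_rounds, 6))
liars = np.full((n_rounds, 4), truth + 3.0)
reports = np.hstack([honest, liars])

estimates, reputation = aggregate_with_reputation(reports)
print(f"naive mean: {reports.mean():.2f}, final estimate: {estimates[-1]:.2f}")
print("reputations:", np.round(reputation, 3))
```

Note the circularity: reputations are scored against the aggregate, not against truth, so this only works while the honest sources hold a weighted majority. Quantifying that breakdown point is the convergence question the item asks.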

4.5 Garrabrant’s logical induction as a prototype

Garrabrant et al. (2020) propose: treat logical truths as things we have credences over, and let a market of “traders” (each polynomial-time computable) bid on logical sentences. A sentence’s stock is worth $1 if true, $0 otherwise. The logical induction criterion: no poly-time trading strategy with finite risk tolerance earns unbounded profit. A computable algorithm satisfying this criterion exists, and its prices converge to truth, become coherent (obey probability axioms), and learn to predict patterns of truth “often long before having the resources to evaluate the statements, so long as the patterns can be written down in polynomial time.”

The load-bearing insight: different traders specialise in detecting different patterns. The market aggregates their heterogeneous computational capabilities through prices. This is a division of computational labour, not just informational labour. The ensemble can collectively track truths that no individual trader can verify.

4.6 Simon’s docility as the demand side

Herbert Simon’s concept of docility (from Administrative Behavior (Herbert A. Simon 1997) and later works (Herbert A. Simon 1990)): computationally bounded agents rationally accept influence from their social environment, because figuring everything out from scratch is too expensive. Docility is adaptive under bounded compute, since importing a peer’s conclusion is cheaper than re-deriving it. The failure mode: docility makes agents manipulable. The design question: when is it rational to be docile, and toward whom?

4.7 The allocation problem

Smashing these together: agents are bounded in both information (they see different parts of the world) and compute (they can verify different logical consequences). Each agent has comparative advantage in certain inferences. The question: when should I reason for myself and when should I import a peer’s conclusion?

This is structurally an explore-exploit tradeoff over epistemic actions:

“Explore” = do my own reasoning/observation (costly, first-hand, reliable)
“Exploit” = adopt a peer’s conclusion (cheap, second-hand, uncertain reliability)

The optimal policy depends on uncertainty about the peer’s reliability, the cost of verification, and the value of getting the right answer. But it’s richer than a standard bandit because:

Verifying one claim gives information about the peer’s reliability on other claims (correlated types)
The peer’s conclusions aren’t independent of ours — if we publish our conclusions, the peer updates (reflexivity)
The verification itself consumes compute that could have been spent on other claims (opportunity cost in a shared budget)
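Setting those three complications aside gives a myopic baseline. As a sketch, assume (hypothetical notation, not taken from the works cited here) that a correct belief is worth $V$, a wrong one is worth $0$, verification costs $c$ and always yields the right answer, and the peer is right with probability $r$. Then:

$$
\underbrace{V - c}_{\text{verify}} \;>\; \underbrace{rV}_{\text{import}}
\quad\Longleftrightarrow\quad
c < (1 - r)\,V .
$$

So it pays to verify only when the cost is below the expected loss from the peer’s errors. The three items above are exactly what stops us from applying this inequality one claim at a time: $r$ is uncertain and shared across claims, the peer’s answer depends on ours, and $c$ is drawn from a budget shared with every other claim.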

4.9 The synthesis I think is missing

No existing formalism simultaneously models agents who are:

1. Bounded in compute (can verify only some claims)
2. Bounded in information (see only part of the world)
3. Strategic (may benefit from misleading others)
4. Embedded in a network (hear from some peers, not all)
5. Making allocation decisions (how much effort to verify vs. import)

Garrabrant handles 1 and 4 but assumes honest traders. Prediction markets handle 3 but assume unbounded compute and a single resolved event. Rational inattention handles 1-2 but assumes a single agent. Division-of-cognitive-labour models handle 1-2 and 4-5 but typically assume non-strategic agents.

The closest compound might be: a Garrabrant-style market with strategic traders, partial information, and a network structure — where the “logical sentences” are replaced by empirical claims about the world, and traders can both observe and compute, with both being costly.

4.10 What would we build? (bounded-compute version)

Verification allocation game. Design a simulated environment: $n$ agents, each can observe $k$ facts and verify $m$ inferences per round. Agents can also read peers’ published conclusions at low cost. Study: what allocation strategies (how much to observe vs. verify vs. import) lead to best aggregate knowledge? What mechanisms (market prices, reputation scores, peer prediction) best incentivise efficient allocation?
Strategic logical induction. Extend Garrabrant’s framework with strategic traders who can profit from misleading the market. Study whether the logical induction criterion can be approximately maintained, or whether adversarial traders can permanently distort prices. The key parameter: what fraction of adversarial traders can the market tolerate?
Docility-as-bandit. Model each agent’s trust in each peer as a bandit arm. Pulling the arm = importing the peer’s conclusion. Reward = whether the conclusion turns out to be consistent/useful (a proxy for truth, since ground truth is unavailable). Study convergence of trust estimates and aggregate belief quality. Compare against: always verify (too expensive), always import from majority (vulnerable to cascades), import from most-consistent-with-own-observations (computationally cheap consistency check).
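The docility-as-bandit item in the list above can be prototyped in a few lines. This sketch uses Thompson sampling with Beta posteriors, which is one reasonable choice of bandit algorithm rather than anything the text prescribes. The peers' reliabilities and the assumption that we get a clean “was it useful?” signal after every import are simplifications.

```python
import numpy as np

def thompson_trust(reliability, n_rounds=2000, seed=0):
    """Each round, import from the peer whose sampled trustworthiness is highest.

    reliability: true probability that each peer's conclusion turns out useful
    (unknown to the agent; used here only to simulate feedback).
    Returns (number of imports per peer, posterior mean trust per peer).
    """
    rng = np.random.default_rng(seed)
    n_peers = len(reliability)
    successes = np.ones(n_peers)  # Beta(1, 1) prior on each peer
    failures = np.ones(n_peers)
    pulls = np.zeros(n_peers, dtype=int)
    for _ in range(n_rounds):
        sampled = rng.beta(successes, failures)  # one posterior draw per peer
        peer = int(np.argmax(sampled))
        useful = float(rng.random() < reliability[peer])
        successes[peer] += useful
        failures[peer] += 1.0 - useful
        pulls[peer] += 1
    return pulls, successes / (successes + failures)

pulls, trust = thompson_trust([0.9, 0.6, 0.3])
print("imports per peer:", pulls)
print("estimated trust:", np.round(trust, 2))
```

Trust estimates for rarely consulted peers stay vague, which is fine for a standard bandit but matters here: a peer who was unreliable on one topic may be the best source on another, and this sketch has no notion of topics at all.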

5 Incoming

Conitzer (2013): social networks, social choice and statistical estimation unified. This seems important, so I wrote a notebook on AI Social choice.
Equilibria Network - Designing New Forms Of Collective Intelligence
AI for AI for Epistemics

AI-powered tools and services that help people figure out what’s true (“AI for epistemics”) could matter a lot. As R&D is increasingly automated, AI systems will play a larger role in the process of developing such AI-based epistemic tools. This has important implications. Whoever is willing to devote sufficient compute will be able to build strong versions of the tools, quickly. Eventually, the hard part won’t be building useful systems, but making sure people trust the right ones, and making sure that they are truth-tracking even in domains where that’s hard to verify. We can do some things now to prepare. Incumbency effects mean that shaping the early versions for the better could have persistent benefits. Helping build appetite among socially motivated actors with deep pockets could enable the benefits to come online sooner, and in safer hands. And in some cases, we can identify particular things that seem likely to be bottlenecks later, and work on those directly.

6 References

Acemoglu, Chernozhukov, and Yildiz. 2006. “Learning and Disagreement in an Uncertain World.” Working Paper 12648. Working Paper Series.

Acemoglu, and Ozdaglar. 2011. “Opinion Dynamics and Learning in Social Networks.” Dynamic Games and Applications.

Aleta, and Moreno. 2019. “The Dynamics of Collective Social Behavior in a Crowd Controlled Game.” EPJ Data Science.

Almaatouq, Rahimian, Burton, et al. 2021. “When Social Influence Promotes the Wisdom of Crowds.” arXiv:2006.12471 [Physics, Stat].

Altman, and Tennenholtz. 2007. “Incentive Compatible Ranking Systems.” In Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems.

An, and Du. 2026. “Methods and Open Problems in Differentiable Social Choice: Learning Mechanisms, Decisions, and Alignment.”

Arguedas, Robertson, Fletcher, et al. 2022. “Echo Chambers, Filter Bubbles, and Polarisation: A Literature Review.”

Atanasov, Rescober, Stone, et al. 2015. “Distilling the Wisdom of Crowds: Prediction Markets Versus Prediction Polls.” Academy of Management Proceedings.

Bergemann, and Morris. 2003. “Robust Mechanism Design.” Levine’s Bibliography.

Borondo, Borondo, Rodriguez-Sickert, et al. 2014. “To Each According to Its Degree: The Meritocracy and Topocracy of Embedded Markets.” Scientific Reports.

Boutilier, Caragiannis, Haber, et al. 2015. “Optimal Social Choice Functions: A Utilitarian View.” Artificial Intelligence.

Carey, Langlois, Merwijk, et al. 2025. “Incentives for Responsiveness, Instrumental Control and Impact.”

Carvalho. 2010. “Sharing Rewards Based on Subjective Opinions.”

Chen, Siyu, Wu, Wu, et al. 2023. “Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model.” In Proceedings of the 40th International Conference on Machine Learning.

Chen, Yiling, and Yu. 2024. “Optimal Scoring Rule Design Under Partial Knowledge.”

Collina, Globus-Harris, Goel, et al. 2025. “Collaborative Prediction: Tractable Information Aggregation via Agreement.”

Conitzer. 2013. “The Maximum Likelihood Approach to Voting on Social Networks.” In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

Crutchfield, and Jurgens. 2025. “Agentic Information Theory: Ergodicity and Intrinsic Semantics of Information Processes.”

Danan, Gajdos, Hill, et al. 2016. “Robust Social Decisions.” American Economic Review.

Dasgupta, and Ghosh. 2013. “Crowdsourced Judgement Elicitation with Endogenous Proficiency.” In Proceedings of the 22nd International World Wide Web Conference (WWW).

Dawid, and Skene. 1979. “Maximum Likelihood Estimation of Observer Error‐Rates Using the EM Algorithm.” Journal of the Royal Statistical Society Series C.

Diakonikolas, and Kane. 2023. Algorithmic High-Dimensional Robust Statistics.

Drolsbach, Solovev, and Pröllochs. 2024. “Community Notes Increase Trust in Fact-Checking on Social Media.” Edited by David Rand. PNAS Nexus.

Fish, Gölz, Parkes, et al. 2025. “Generative Social Choice.”

Fluri, Paleka, and Tramèr. 2024. “Evaluating Superhuman Models with Consistency Checks.”

Galanis, Ioannou, and Kotronis. 2024. “Information Aggregation Under Ambiguity: Theory and Experimental Evidence.” Review of Economic Studies.

Garrabrant, Benson-Tilsen, Critch, et al. 2016. “Logical Induction (Abridged).”

———, et al. 2020. “Logical Induction.”

Gneiting, and Raftery. 2007. “Strictly Proper Scoring Rules, Prediction, and Estimation.” Journal of the American Statistical Association.

Golub, and Jackson. 2011. “Network Structure and the Speed of Learning: Measuring Homophily Based on Its Consequences.” SSRN Scholarly Paper ID 1784542.

———. 2012. “How Homophily Affects the Speed of Learning and Best-Response Dynamics.” The Quarterly Journal of Economics.

Huang, Strack, and Tamuz. 2024. “Learning in Repeated Interactions on Networks.” Econometrica.

Hyland, and Albarracin. 2025. “On the Variational Costs of Changing Our Minds.”

Hyland, Gavenčiak, Costa, et al. 2024. “Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents.”

Ibrahim. 2023. “Learning from Crowdsourced Noisy Annotations: From Dawid-Skene to Deep Neural Networks.”

Ince, Moresco, Peri, et al. 2025. “Constructing Elicitable Risk Measures.”

Kalai, and Lehrer. 1993. “Rational Learning Leads to Nash Equilibrium.” Econometrica.

Lalitha, Javidi, and Sarwate. 2014. “Social Learning and Distributed Hypothesis Testing.” arXiv:1410.4307 [Cs, Math, Stat].

Lanctot, Larson, Kaisers, et al. 2025. “Soft Condorcet Optimization for Ranking of General Agents.”

List, and Goodin. 2001. “Epistemic Democracy: Generalizing the Condorcet Jury Theorem.” Journal of Political Philosophy.

Mann, and Helbing. 2017. “Optimal Incentives for Collective Intelligence.” Proceedings of the National Academy of Sciences.

Matějka, and McKay. 2015. “Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model.” American Economic Review.

Maura-Rivero, Lanctot, Visin, et al. 2025. “Jackpot! Alignment as a Maximal Lottery.”

Maura-Rivero, Nagpal, Patel, et al. 2025. “Utility-Inspired Reward Transformations Improve Reinforcement Learning Training of Language Models.”

Mercier, and Claidière. 2021. “Does Discussion Make Crowds Any Wiser?” Cognition.

Miller, Resnick, and Zeckhauser. 2005. “Eliciting Informative Feedback: The Peer-Prediction Method.” Management Science.

Muldoon. 2013. “Diversity and the Division of Cognitive Labor.” Philosophy Compass.

Munos, Valko, Calandriello, et al. 2024. “Nash Learning from Human Feedback.” In Proceedings of the 41st International Conference on Machine Learning. ICML’24.

Navajas, Niella, Garbulsky, et al. 2018. “Aggregated Knowledge from a Small Number of Debates Outperforms the Wisdom of Large Crowds.” Nature Human Behaviour.

Niemeyer, Veri, Dryzek, et al. 2023. “How Deliberation Happens: Enabling Deliberative Reason.” American Political Science Review.

Olckers, and Walsh. 2024. “Manipulation and Peer Mechanisms: A Survey.” Artificial Intelligence.

Prelec. 2004. “A Bayesian Truth Serum for Subjective Data.” Science.

Prelec, Seung, and McCoy. 2017. “A Solution to the Single-Question Crowd Wisdom Problem.” Nature.

Ren, and Beard. 2005. “Consensus Seeking in Multiagent Systems Under Dynamically Changing Interaction Topologies.” Automatic Control, IEEE Transactions on.

Simon, Herbert A. 1990. “A Mechanism for Social Selection and Successful Altruism.” Science.

Simon, Herbert A. 1997. Administrative Behavior.

Sims. 2003. “Implications of Rational Inattention.” Journal of Monetary Economics.

Siththaranjan, Laidlaw, and Hadfield-Menell. 2023. “Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF.”

Sudhir, and Tran-Thanh. 2025. “Market-Based Architectures in RL and Beyond.”

Sunstein, and Hastie. 2014. Wiser: Getting Beyond Groupthink to Make Groups Smarter.

Thoma. 2015. “The Epistemic Division of Labor Revisited.” Philosophy of Science.

Trouche, Sander, and Mercier. 2014. “Arguments, More Than Confidence, Explain the Good Performance of Reasoning Groups.” SSRN Scholarly Paper ID 2431710.

Walters, Kaufmann, Sefas, et al. 2025. “Free Energy Risk Metrics for Systemically Safe AI: Gatekeeping Multi-Agent Study.”

Weisberg, and Muldoon. 2009. “Epistemic Landscapes and the Division of Cognitive Labor.” Philosophy of Science.

Witkowski, and Parkes. 2012. “A Robust Bayesian Truth Serum for Small Populations.” In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence. AAAI’12.

Yu, Song, and Ermon. 2019. “Multi-Agent Adversarial Inverse Reinforcement Learning.”

Zhang, Chen, Zhou, et al. 2016. “Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing.” Journal of Machine Learning Research.

Zollman. 2010. “The Epistemic Benefit of Transient Diversity.” Erkenntnis.

Footnotes

¹ See also “Robust Bayesian Persuasion” and “Algorithmic Persuasion Through Simulation” for the computational side.