Epistemic communities
Descriptive and normative
2021-08-24 — 2026-04-17
Wherein the several aims of collective truth-seeking are enumerated, and their mutual incompatibility is noted, with peer review instanced as the community most amenable to experimental manipulation.
There is a lot to say about epistemic communities in the abstract. Will I ever get around to saying it in full, or will I defer my effort in favour of specific cases? Let us find out.
What do I even mean by “epistemic community”? I have in mind any collective that produces and maintains shared knowledge, whether that’s a scientific discipline, a news media ecosystem, or a Wikipedia subcommunity. The common thread is that they are all trying to figure out what’s true, and they have some kind of social structure that they hope to leverage to help them do it.
The science community has thought hard about this; journalism seems to have thought about it somewhat. I think it is becoming relevant in new ways in online communities and civic tech.
1 What is a good epistemic?
We can ask an epistemic community to do any of the following:
- converge on true beliefs at the margin, as fast as possible
- maintain enough diversity that the community does not lock in on a confident falsehood
- produce consensus-backed action (e.g. public-health messaging)
- let individuals skim useful knowledge cheaply
- ratify the status of its members
These aims trade off. A community optimised for fast convergence will purge heterodoxy we might later need (O’Connor, Goldberg, and Goldman 2024; Weatherall, O’Connor, and Bruner 2018). One optimised for diversity will struggle to close questions that actually are closed. One that grades its members on status will find its incentives drifting off the object level.
So the first design question is: which of these are we asking of this particular community, under what budget? IMO a lot of debate about “trust in experts” is at bottom a debate about which of these we think experts are for, which is why the participants talk past each other.
2 Bayesian epistemics
An inverse design approach: how should we design a system to optimise for truthfulness? See Bayesian epistemics. The peer-prediction and Bayesian Truth Serum family — which also appears in peer review — is the sharp end of this: we can elicit truthful reports without ground truth, at least in principle.
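The core trick can be sketched for a single binary question. This is a simplified reading of Prelec-style Bayesian Truth Serum, not the canonical presentation (the clamping and the exact functional form here are my choices): reward answers that turn out to be more common than the crowd predicted, plus a proper-scoring term on each respondent's prediction of the crowd.

```python
import math

def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian Truth Serum sketch for one binary question.

    answers: list of 0/1 endorsements, one per respondent.
    predictions: each respondent's predicted population fraction answering 1.
    Returns one score per respondent: an information score (log-ratio of the
    actual frequency of their answer to the geometric mean of predicted
    frequencies) plus alpha times a prediction score, which is a negative
    KL divergence and so never positive.
    """
    n = len(answers)
    eps = 1e-6  # keep frequencies strictly inside (0, 1) for the logs
    xbar1 = min(max(sum(answers) / n, eps), 1 - eps)
    xbar = {1: xbar1, 0: 1 - xbar1}
    # geometric means of the predicted frequencies for each answer
    ybar1 = math.exp(sum(math.log(min(max(p, eps), 1 - eps))
                         for p in predictions) / n)
    ybar0 = math.exp(sum(math.log(min(max(1 - p, eps), 1 - eps))
                         for p in predictions) / n)
    ybar = {1: ybar1, 0: ybar0}
    scores = []
    for a, p in zip(answers, predictions):
        p = min(max(p, eps), 1 - eps)
        info = math.log(xbar[a] / ybar[a])   # "surprisingly common" bonus
        pred = sum(xbar[k] * math.log(y / xbar[k])   # -KL(xbar || y_r) <= 0
                   for k, y in ((1, p), (0, 1 - p)))
        scores.append(info + alpha * pred)
    return scores
```

A respondent who predicts the crowd exactly loses nothing on the prediction term, and the information term rewards minority answers only when the crowd underestimated their frequency, which is the mechanism's whole point.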
3 Formal models of the enterprise
Bayesian epistemics gives the micro-foundations; an adjacent literature takes the next step and asks what happens once we wire scoring rules into the principal-agent problems that actually constitute a scientific community.
Chen et al. (2023) combine proper scoring rules with the principal-agent model: under mild conditions, a principal can learn to incentivise accurate information acquisition by paying agents according to a scoring rule on their reported posteriors. This connects the Bayesian epistemics toolkit to the design of research grants — it is at least thinkable that a funding agency could run something like a scoring-rule market and do better than “did the results look plausible to a panel”.
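The incentive logic is easy to exhibit with a quadratic (Brier-style) scoring rule. This is a minimal sketch of the properness property that the mechanism leans on, not Chen et al.'s actual contract:

```python
def brier_payment(report, outcome):
    """Quadratic scoring rule: pays more the closer the reported probability
    was to what actually happened. Strictly proper, so the agent's expected
    payment is maximised by reporting their true posterior."""
    return 1.0 - (outcome - report) ** 2

def expected_payment(true_belief, report):
    """Expected payment if the event occurs with probability true_belief."""
    return (true_belief * brier_payment(report, 1)
            + (1 - true_belief) * brier_payment(report, 0))

# grid-search over reports: the optimum is exactly the true belief
p = 0.7
reports = [i / 100 for i in range(101)]
best = max(reports, key=lambda q: expected_payment(p, q))  # -> 0.7
```

Because misreporting costs the agent money in expectation, the principal can pay for posteriors without ever observing the agent's effort, which is what makes the grant-funding analogy thinkable.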
The paper-reader relationship is also principal-agent, but messier, because the reader is not the principal who paid the author — they are paying only with their attention. Wei (2021) models this: when the receiver has to pay a cost to process a signal (the realistic case for scientific literature), the sender has latitude to persuade rather than inform, because the receiver cannot afford to audit everything. This formalises the worry that a lot of what looks like science is in fact rhetoric aimed at busy reviewers, and it tells us when the failure mode bites hardest — high reader-cost plus high author-reputation-return equals a persuasion equilibrium.
Aggregation can fail in its own way. Kong and Schoenebeck (2022) study what happens when individual agents’ signals are correlated and the agents do not know how correlated they are: the aggregate can be confidently wrong. Prediction-market prices, under their model, can sustain false consensus for longer than one might hope. A useful corrective if we were hoping prediction markets would simply solve the problem for us.
Mann and Helbing (2017) ask what incentive scheme maximises the diagnostic accuracy of a committee that has to decide, from its members’ private signals, what is true. The minimum-variance unbiased answer is not “majority vote with equal weights”; it depends on the correlation structure of the signals and on how we reward members for reports that turn out to be right versus wrong. The optimal reward scheme pays disproportionately for dissent that turns out to be correct — formal backing for the diversity-of-practice argument, and a quantitative counterpart to the bridging-based-ranking intuition below.
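The correlation point can be made concrete with the textbook minimum-variance unbiased estimator, a toy model rather than Mann and Helbing's full reward scheme: if each member reports the truth plus Gaussian noise with covariance matrix `cov`, the optimal weights are proportional to the row sums of the inverse covariance, so redundant correlated members get down-weighted relative to an independent dissenter.

```python
def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting (enough for 3x3)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def mvu_weights(cov):
    """Minimum-variance unbiased weights for reports x_i = theta + e_i,
    e ~ N(0, cov): w proportional to cov^{-1} 1, normalised to sum to one."""
    raw = solve(cov, [1.0] * len(cov))
    s = sum(raw)
    return [w / s for w in raw]

# members 1 and 2 share a source (rho = 0.9); member 3 is independent
cov = [[1.0, 0.9, 0.0],
       [0.9, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
weights = mvu_weights(cov)  # the independent member gets nearly twice the weight
```

Equal-weight majority vote is optimal only in the special case of equal variances and zero correlations, which real committees rarely satisfy.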
4 The deference economy
A community that produces knowledge has to decide whose claims to defer to. It seems to me that most “epistemic crises” in public debate actually play out here — not on the object level, but on the meta level of which sources carry weight and why.
Two relevant notebooks:
- Status draws the standard distinction between dominance and prestige. Prestige — deference voluntarily conferred toward competence — is the ingredient that makes scientific cooperation scale beyond kin.
- Reputation systems surveys attempts to formalize prestige into something a machine can compute.
The usual failure mode is that the prestige signal decouples from the underlying competence. Dan Williams (Williams 2022, 2023) has a long-running argument that many apparently-crazy beliefs are ingroup signalling dressed up as epistemics. That framing lines up with invasive arguments and with Timur Kuran’s preference-falsification cascades. If the reward for expressing belief B is uncorrelated with B being true, we should expect B to drift.
Science-studies has formal models of this. Cailin O’Connor and James Weatherall’s agent-based network models (O’Connor and Weatherall 2017; O’Connor 2024; Weatherall, O’Connor, and Bruner 2018) show how an innocently-homophilous community can lock in false consensus. Paul Thagard’s coherence-based work (Thagard 1993, 2005) covers related ground from a cognitive-science angle. David Wolpert’s stochastic/thermodynamic analyses (Wolpert and Kinney 2023, 2024; Wolpert and Harper 2025) are on my list but I haven’t cracked them yet. I should probably work through the Seselja survey (Šešelja 2022) before forming strong opinions here.
5 Peer review
Peer review is the most-instrumented epistemic community we have. People bolt on incentive mechanisms and watch what happens; they A/B test on NeurIPS. Some fuzzy lessons learned:
- Admission control. We get more effort out of reviewers if reviewing is priced into the right to submit. This is a baby version of the general problem: how do we stop free-riding on the commons of attention?
- Randomness against manipulation. Deterministic assignment is gameable; controlled randomness buys robustness and cheap experiments.
- Aggregating miscalibrated judges. People use 1–10 scales differently. Simple averaging bakes the noise in. Social choice theory has something to say about this, as do empirical calibration methods.
- Elicitation with proper scoring rules. Asking reviewers to forecast observable outcomes, then scoring those forecasts, is incentive-compatible in a way that rubric ratings are not.
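The miscalibration point can be made concrete. A minimal sketch, assuming only that each reviewer's raw scores are an affine distortion of a shared scale (real venues fit richer models; the function name is mine):

```python
from statistics import mean, pstdev

def calibrated_scores(ratings):
    """ratings: {reviewer: {paper: raw score}}.

    Z-score each reviewer's scores against their own mean and spread before
    averaging across papers, so a harsh reviewer's 4 and a generous
    reviewer's 8 can carry the same information. Returns {paper: score}."""
    z = {}
    for reviewer, scores in ratings.items():
        mu = mean(scores.values())
        sd = pstdev(scores.values()) or 1.0  # guard a constant reviewer
        for paper, s in scores.items():
            z.setdefault(paper, []).append((s - mu) / sd)
    return {paper: mean(vals) for paper, vals in z.items()}

# a harsh and a generous reviewer who actually agree on the ranking
out = calibrated_scores({"harsh":    {"A": 3, "B": 5},
                         "generous": {"A": 7, "B": 9}})
```

After calibration the two reviewers contribute identical relative judgements, whereas raw averaging would have let their scale offsets masquerade as signal whenever coverage is uneven.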
All four of these apply to any epistemic community that has to aggregate distributed expert judgement — editorial boards, Wikipedia’s admin layer, Q&A sites, Community Notes, grand juries, even deliberative-poll experiments. TODO: write this comparison out properly.
A subtler lesson from the NeurIPS experiments: the process itself is the experimental subject. If we never A/B test our community’s rules, we cannot learn. A lot of institutions are “unfalsifiable” in this sense, which is a design mistake, not a neutrality.
6 AI at the margins
LLMs change the economics in ways I have not finished thinking through. The cases I find most concrete:
- Automated criticism. AI-assisted review for scientific papers (Allen-Zhu and Xu 2025); the Black Spatula Project; o1 catching arithmetic errors; roastmypost. Arbitration — analysing an existing dispute — is cognitively easier than generating a novel critique, which suggests a division of labour where LLMs mop up the mechanical-correctness pass and humans focus on taste and framing.
- Bridging algorithms. Community Notes uses a matrix-factorisation ranker that surfaces notes rated helpful by users who usually disagree; Polis and vTaiwan apply a related pattern to structured deliberation. See bridging-based ranking for the mathematics and the broader family (Ovadya & Thorburn’s writeups, the Forethought design sketches for collective epistemics).
- Deliberative tooling. Polis / vTaiwan provide structured, AI-assisted group deliberation. The DeepMind “Habermas Machine” experiment is in the same family.
- Degraded signals. An LLM-saturated Quora is worse than the pre-LLM Quora. The cheaper passable-looking text gets, the more expensive it becomes to sort a community from its doppelgängers. Related: spamularity, misinformation, AI persuasion.
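The bridging pattern can be sketched with the standard additive matrix-factorisation model, rating ≈ μ + b_user + b_note + f_user·f_note. This is a toy in the spirit of the Community Notes ranker, not its actual implementation; the function name and hyperparameters are mine:

```python
import random

def bridge_rank(ratings, dim=1, epochs=2000, lr=0.05, reg=0.1, seed=0):
    """ratings: list of (user, note, rating in [0, 1]) triples.

    Fit rating ~ mu + b_user + b_note + f_user . f_note by SGD, then rank
    notes by their intercept b_note. Approval explained by the latent factor
    (one faction upvoting its own notes) is absorbed by f, so only approval
    that survives across the disagreement axis raises a note's rank."""
    rng = random.Random(seed)
    users = sorted({u for u, n, r in ratings})
    notes = sorted({n for u, n, r in ratings})
    bu = {u: 0.0 for u in users}
    bn = {n: 0.0 for n in notes}
    fu = {u: [rng.gauss(0, 0.1) for _ in range(dim)] for u in users}
    fn = {n: [rng.gauss(0, 0.1) for _ in range(dim)] for n in notes}
    mu = 0.0
    for _ in range(epochs):
        for u, n, r in ratings:
            pred = mu + bu[u] + bn[n] + sum(a * b for a, b in zip(fu[u], fn[n]))
            err = r - pred
            mu += lr * err
            bu[u] += lr * (err - reg * bu[u])
            bn[n] += lr * (err - reg * bn[n])
            for k in range(dim):
                fu_k, fn_k = fu[u][k], fn[n][k]
                fu[u][k] += lr * (err * fn_k - reg * fu_k)
                fn[n][k] += lr * (err * fu_k - reg * fn_k)
    return dict(sorted(bn.items(), key=lambda kv: -kv[1]))

# faction A loves the partisan note; everyone endorses the bridging note
ratings = [("a1", "partisan", 1), ("a2", "partisan", 1),
           ("b1", "partisan", 0), ("b2", "partisan", 0),
           ("a1", "bridging", 1), ("a2", "bridging", 1),
           ("b1", "bridging", 1), ("b2", "bridging", 1)]
ranked = bridge_rank(ratings)
```

The partisan note's within-faction approval is explained away by the factor term, so the bridging note wins on intercept even though both received the same number of positive ratings from their supporters.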
I suspect “does AI help or hurt?” is the wrong question — it’s underspecified. Better: how does AI shift the cost structure of the underlying mechanism? A community whose signal was mostly “could this person write a passable essay” is in trouble; one whose signal was mostly “could this person stake their reputation on a concrete forecast” is not.
7 Paying for truth
Internal review mechanisms are only half of the problem; the other half is the external payoff. If the reward for producing engaging slop is higher than the reward for producing truthful content, truth loses at the margin no matter how well the internal mechanism is designed. Public sphere business models § Paying for truth walks through the candidate funders (advertising, subscriptions, patronage, public broadcasting, philanthropy, prediction markets, reputation economies) and their characteristic failure modes.
To a first approximation, truth is a public good whose quality is only legible in the long run, which puts the problem close to “how do we fund basic research” — old and only partially solved.
8 Grab bag of design choices
Common interesting design themes from the above:
- Entry and exit costs — admission control, review credits, Sybil resistance (see reputation systems).
- Scale and context — Dunbar-number limits, subgroup granularity, context collapse. See Gordon Brander on thinking together and common knowledge.
- Anonymity vs pseudonymity vs identified. The peer-review trade-offs generalise.
- Elicitation format — freeform text, numeric rubric, probabilistic forecast, peer-prediction report.
- Aggregation rule — simple average, calibrated average, matrix-factorised bridging, prediction-market price, jury.
- Feedback loops — do members see outcomes? does the community know its own calibration?
- Adversarial robustness — collusion rings, cycles, fake identities, bot swarms, LLM spam.
- Funding source, and how it colours the above.
Most interesting design work happens at the intersection of two or three of these.
9 Research I should name-check
In no particular order.
- Cailin O’Connor and James Weatherall, The Misinformation Age, plus the underlying network-of-scientists simulations (O’Connor and Weatherall 2017; O’Connor 2024; O’Connor, Goldberg, and Goldman 2024; Weatherall, O’Connor, and Bruner 2018). In-bib; need to actually read.
- Jonathan Rauch, The Constitution of Knowledge. Popular rather than technical; useful for the framing of “reality-based community” as a social institution with rules. (Rauch 2021)
- Hugo Mercier and Dan Sperber, The Enigma of Reason. The argumentative theory of reasoning implies that epistemic communities, not individuals, are the unit of cognition. (Mercier and Sperber 2017)
- Dan Williams, Conspicuous Cognition. Blog aside, his journal work on epistemic signalling (Williams 2021a, 2022, 2023) is already in-bib.
- Kevin Zollman on scientific network topology and the epistemic value of transient disagreement (the “Zollman effect”). (Zollman 2007, 2012)
- Henry Farrell and Cosma Shalizi, Cognitive Democracy, for the analytical frame. Farrell’s later work on AI-era epistemics (Farrell, Mercier, and Schwartzberg 2022) is in-bib.
- Timur Kuran, Private Truths, Public Lies (preference falsification). Sits naturally next to Williams. (Kuran 1997)
- Frischmann, Madison, and Strandburg, Governing Knowledge Commons (Cambridge). Ostromian analysis applied specifically to knowledge production. (Frischmann, Madison, and Strandburg 2014)
- Elinor Ostrom, Governing the Commons. Already arrives via public sphere business models; probably needs its own entry point here.
- Aviv Ovadya and Luke Thorburn on bridging-based ranking and “platform democracy”. TODO cite.
- Renée DiResta, Invisible Rulers. Influence operations as an industry. TODO cite.
- Chris Bail, Breaking the Social Media Prism. Empirical work on echo chambers where the results are not the obvious ones (cf. Eady et al. 2023).
- J. Nathan Matias and the Citizens and Technology Lab. Experimental manipulation of online-community norms.
- David Wolpert, stochastic/thermodynamic analyses of epistemic communities (Wolpert and Kinney 2023, 2024; Wolpert and Harper 2025).
- Paul Thagard on coherentist social epistemology (Thagard 1993, 2005).
- Šešelja (2022), a survey of agent-based models of epistemic communities. Read this before the originals.
- Community Notes research program (Birdwatch / Wojcik et al.). TODO cite.
- Audrey Tang, Polis, and vTaiwan writeups. TODO cite.
- Jason McKenzie Alexander on the norms of science and evolutionary game theory of epistemic communities. TODO cite.
- Joshua Habgood-Coote on whether “misinformation” is a well-posed category at all. TODO cite.
10 See also
- science generation, science validation for the specific case of science.
- science peer review for the micro-mechanics.
- public sphere business models, renewing journalism for the funding side.
- mechanism design, soft mechanism design, Bayesian epistemics for the formal tools.
- status, reputation systems for the currency of credibility.
- invasive arguments, misinformation, spamularity, AI persuasion for the adversarial side.
- common knowledge, wisdom of crowds vs groupthink, knowledge topology, epistemic bottlenecks for the shape of what we collectively know.
- civic tech, social choice, prediction markets for aggregation into decisions.
- evolution of perception and truth, truth effectiveness heat pump for whether the whole endeavour is well-posed.
11 Incoming
News media and the public’s shared reality. Fake news, indeterminate news, incomplete news, alternative facts, strategic inference, kompromat, agnotology, facebooking to a molecular level. Basic media literacy, and whether it helps. As seen in elections and provocateur twitter bots.
AI Tools for Trust: Community Notes, Rhetoric Detection & More
Dan Williams, Can experts save the public from error?
Erik Hoel, The gossip trap
Gordon Brander, Thinking together, on egregores, Dunbar numbers and information-processing thresholds in Holocene social evolution, which all motivate
- Noosphere, a protocol for thought
- source
- The explainer is poor and doesn’t even tell us how to start one of these noospheres or do anything in particular, which indicates a suspect development model.
Why Quora isn’t useful anymore: A.I. came for the best site on the internet.
Marisa Abrajano has a provocative list of research topics. I’d like to read her work to see her methodology.
Marisa Abrajano is a professor of political science at the University of California, San Diego, and Provost of Earl Warren College. Her research focuses on racial and ethnic inequalities in the political system, particularly political participation, voting and campaigns, and the mass media. She is the author of five books; her latest, with Nazita Lajevardi, explores the politics of misinformation among socially marginalized groups.
Samuel Butler:
The public buys its opinions as it buys its meat, or takes in its milk, on the principle that it is cheaper to do this than to keep a cow. So it is, but the milk is more likely to be watered.

