Groupthink and the wisdom of crowds

God help me I need to extract truth from the internet

2019-09-22 — 2026-06-23

Wherein the Conditions Under Which Diversity Among Group Members Is Found to Yield Measurable Epistemic Dividends Are Examined, Alongside Mechanisms Such as the Surprisingly Popular Algorithm.

cooperation
culture
democracy
distributed
economics
extended self
rhetoric
sociology
squad
wonk
Figure 1

To link to: getting along, swarm sensing, voting systems, democracy, groupthink. Social choice under peer influence 🏗.

When do group decisions embody the wisdom of crowds and when groupthink? How do we tie the group consensus to reality rather than let the dynamics of signalling and simulacra dominate? The formal side of this — what mechanisms can extract reliable belief from heterogeneous, strategic agents — lives in learning from the madness of crowds and in the Bayesian-epistemics literature on proper scoring rules and peer prediction. For tools and platforms that try to engineer wisdom-of-crowds outcomes by design — AI mediators, bridging algorithms, deliberative civic platforms — see civic tech and AI-mediated governance.

1 Diversity dividends

Maybe diversity and tolerance aren’t just intrinsic moral goods; they might also pay literal dividends by helping groups avoid groupthink and work more effectively. What are the conditions for this happy state?

Does diversity help attain wisdom? Sometimes, it seems. Scott Page calls this the diversity dividend. Quantifying when and how it works interests me.
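One standard quantitative handle, quoted here as a known identity rather than anything derived in this post, is the diversity prediction theorem from Page (2008). The notation is an assumption of this sketch: \(s_1, \dots, s_n\) are individual numerical predictions of a true value \(\theta\), and \(c = \frac{1}{n}\sum_i s_i\) is the crowd's average prediction.

\[
(c - \theta)^2 = \frac{1}{n}\sum_{i=1}^{n} (s_i - \theta)^2 - \frac{1}{n}\sum_{i=1}^{n} (s_i - c)^2
\]

The crowd's squared error equals the average individual squared error minus the spread of the predictions around their mean. For simple averaging, then, diversity of predictions counts for exactly as much as individual accuracy. The identity says nothing about strategic or correlated agents, which is where the harder questions live.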

Practically, see cultivating diversity.

McKinsey report, Vivian Hunt, Dennis Layton, and Sara Prince: Why diversity matters:

While correlation does not equal causation (greater gender and ethnic diversity in corporate leadership doesn’t automatically translate into more profit), the correlation does indicate that when companies commit themselves to diverse leadership, they are more successful.

(They could have done better than that mealy-mouthed correlation phrasing by attempting an actual causal analysis.)

Other random readings: Chris Dillow, diversity trumps ability.

The new Matthew Syed book (Syed 2020) (titled Rebel Ideas or Superteams depending on where you are) apparently covers some of this material.

Figure 2

2 Social structure of knowledge

Vested interests, contrarians, consensus. These generate some stylised dynamics in the social structure of knowledge which I would like to explore with mathematical models and simulations.

Scott Aaronson on “armchair epidemiology” uses the COVID-19 public communication fiasco as a lens on societal collective knowledge and science and the role of contrarians. Connection to red queen signal dynamics should be apparent. The comment threads in that post meander around this topic at length.

This resembles another pyramid of fashionable disagreement that he mentions, the Intellectual Hipsters and Meta-Contrarianism pyramid.

3 Crowdsourcing models

These models are due to Dawid and Skene (1979) and their descendants.

In the classical setup, each item has a latent ground truth label, and each of \(n\) noisy annotators reports a label corrupted according to that annotator's own unknown confusion matrix. Expectation-maximisation can estimate these matrices, and spectral methods (Zhang et al. 2016) can identify annotator quality without ground truth. The limitation for our purposes is that Dawid-Skene-type methods rely on many annotators labelling the same items, whereas internet authors write about different things. Still, the mathematical machinery (latent truth, heterogeneous noise, identification via redundancy) might be useful.
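To make the classical setup concrete, here is a minimal EM sketch for a Dawid-Skene-style model. It assumes the dense case described above (every annotator labels every item). The function name `dawid_skene`, the `smoothing` constant, and the simulated annotator accuracies are all illustrative choices rather than anything from the cited papers.

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50, smoothing=1e-2):
    """EM for a Dawid-Skene-style model (illustrative sketch).

    labels: (n_items, n_annotators) integer array with entries in 0..n_classes-1.
    Returns the posterior over true labels and per-annotator confusion matrices.
    """
    onehot = np.eye(n_classes)[labels]      # (items, annotators, classes)
    # Initialise the posterior over true labels with per-item vote shares.
    post = onehot.mean(axis=1)              # (items, classes)
    for _ in range(n_iter):
        # M-step: class prior, and confusion[a, j, l] = P(a reports l | truth j).
        prior = post.mean(axis=0)
        confusion = np.einsum("ij,ial->ajl", post, onehot) + smoothing
        confusion /= confusion.sum(axis=2, keepdims=True)
        # E-step: posterior over each item's true label given all reports.
        log_post = np.log(prior + 1e-12) + np.einsum(
            "ial,ajl->ij", onehot, np.log(confusion)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post, confusion

# Simulated binary task: five annotators of decreasing (made-up) accuracy.
rng = np.random.default_rng(0)
n_items = 300
accuracies = np.array([0.9, 0.85, 0.8, 0.6, 0.55])
truth = rng.integers(0, 2, size=n_items)
correct = rng.random((n_items, len(accuracies))) < accuracies
labels = np.where(correct, truth[:, None], 1 - truth[:, None])

post, confusion = dawid_skene(labels, n_classes=2)
# Average of the diagonal of each confusion matrix: an accuracy estimate.
estimated_accuracy = confusion[:, [0, 1], [0, 1]].mean(axis=1)
inferred = post.argmax(axis=1)
```

Initialising from vote shares breaks the label-switching symmetry in favour of the majority, which is fine when most annotators are better than chance and is exactly what fails when they are not.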

4 Crowd meta-knowledge

We can hope to extract truth from crowds by thinking about incentives, Bayes, and meta-knowledge. Key works here are Prelec (2004) and descendants (Witkowski and Parkes 2012; Miller, Resnick, and Zeckhauser 2005). The “surprisingly popular” (SP) algorithm (Prelec 2004; Prelec, Seung, and McCoy 2017) seems like a good start: it extracts truth from crowds by finding answers that are more popular than predicted.

Concretely: each respondent gives their own answer and also predicts how the rest of the crowd will answer. People who hold the correct-but-minority view often know their view is rare, so they predict low support for it, and so does the mistaken majority. The correct answer then ends up more popular than the crowd predicted. The SP algorithm uses this gap between actual and predicted popularity as its signal, that is, the gap between first-order beliefs (what I think) and second-order beliefs (what I think others think).

Does the structure of the internet corpus contain enough second-order information to support something like this? Blog posts often respond to other positions, explicitly modelling what “they” believe. This meta-discursive structure might be informative in the SP sense: the internet running a massively distributed, ramshackle version of the same trick, finding coherent patterns that show up more than we’d expect if the corpus were just noise (cf. Collina et al. (2025) on collaborative prediction by sharing predictions, not data).
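The selection rule itself is small enough to sketch. The following assumes the simplest single-question setting: each respondent casts one vote and reports a predicted vote share for every option. The function name `surprisingly_popular` and the toy numbers are invented for illustration and are not data from the cited papers.

```python
import numpy as np

def surprisingly_popular(votes, predicted_shares):
    """Pick the option whose actual vote share most exceeds its predicted share.

    votes: length-n sequence of chosen option indices.
    predicted_shares: (n, n_options) array; row i is respondent i's prediction
        of the fraction of the crowd choosing each option.
    """
    votes = np.asarray(votes)
    predicted_shares = np.asarray(predicted_shares, dtype=float)
    n_options = predicted_shares.shape[1]
    actual = np.bincount(votes, minlength=n_options) / len(votes)
    predicted = predicted_shares.mean(axis=0)
    surprise = actual - predicted
    return int(np.argmax(surprise)), actual, predicted

# Toy crowd (made-up numbers): option 0 is a popular wrong answer, option 1 is correct.
# Six respondents vote 0 and expect 80% of the crowd to agree with them.
# Four respondents vote 1 but know they are in the minority: they expect 70% to vote 0.
votes = [0] * 6 + [1] * 4
predicted_shares = [[0.8, 0.2]] * 6 + [[0.7, 0.3]] * 4

sp_answer, actual, predicted = surprisingly_popular(votes, predicted_shares)
majority_answer = int(np.argmax(actual))
```

In this toy crowd the majority vote picks option 0, while option 1 gets 40% of the votes against a predicted 24%, so the SP rule picks option 1. The rule needs explicit second-order predictions, which is precisely the ingredient that would have to be inferred from text in the internet-corpus setting.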

5 Robust statistics

We can estimate the mean of a distribution even when an \(\varepsilon\)-fraction of samples are adversarially corrupted, as long as the clean distribution has bounded moments. This works because, in high dimensions, corruption that shifts the empirical mean appreciably must also distort the empirical covariance in detectable ways. Specifically, it introduces “spurious” large eigenvalues, and we can filter out the samples responsible for them. By analogy, we might hope that even if an \(\varepsilon\)-fraction of the internet is adversarially generated, we could still learn through the noise, provided the “clean” distribution over text has enough structure. The book to read on this is Diakonikolas and Kane (2023).
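A toy version of the eigenvalue-filtering idea is sketched below. It is a deliberately crude simplification, not the algorithm as analysed in the book. It assumes the clean data have roughly identity covariance, and it trims a fixed fraction of the most extreme points along the top eigenvector each round. The names `filtered_mean`, `drop_frac`, and `eig_threshold`, and their default values, are hypothetical choices for this sketch.

```python
import numpy as np

def filtered_mean(x, drop_frac=0.02, eig_threshold=1.5, max_iter=100):
    """Crude spectral filter for robust mean estimation (illustrative sketch).

    Assumes clean samples have covariance close to the identity, so a top
    eigenvalue well above 1 signals corruption along that direction.
    Returns the mean of the surviving samples and how many survived.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        mu = x.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
        if eigvals[-1] <= eig_threshold:
            break  # covariance looks clean: stop filtering
        # Score each point by its squared deviation along the top eigenvector.
        scores = ((x - mu) @ eigvecs[:, -1]) ** 2
        n_drop = max(1, int(drop_frac * len(x)))
        x = x[np.argsort(scores)[:-n_drop]]  # discard the highest-scoring points
    return x.mean(axis=0), len(x)

# Simulation: 90% clean standard Gaussian samples with true mean zero,
# 10% corrupted samples clustered at distance 5 along one direction.
rng = np.random.default_rng(1)
d = 20
clean = rng.standard_normal((1800, d))
direction = np.ones(d) / np.sqrt(d)
outliers = 5 * direction + 0.1 * rng.standard_normal((200, d))
data = np.vstack([clean, outliers])

naive_error = np.linalg.norm(data.mean(axis=0))
robust_mean, n_kept = filtered_mean(data)
robust_error = np.linalg.norm(robust_mean)
```

The stopping threshold encodes the bounded-moments assumption: it only makes sense if we know roughly how large the clean covariance can be. That is the knowledge whose text-corpus analogue is unclear.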

This is an analogy; making it do real work needs more. What are the “bounded moments” assumptions in the text setting? What does “adversarial corruption” even mean when the data has sequential structure? 🏗

6 Incoming

7 References

Acemoglu, Chernozhukov, and Yildiz. 2006. “Learning and Disagreement in an Uncertain World.” Working Paper 12648. Working Paper Series.
Acemoglu, and Ozdaglar. 2011. “Opinion Dynamics and Learning in Social Networks.” Dynamic Games and Applications.
Aleta, and Moreno. 2019. “The Dynamics of Collective Social Behavior in a Crowd Controlled Game.” EPJ Data Science.
Almaatouq, Rahimian, Burton, et al. 2021. “When Social Influence Promotes the Wisdom of Crowds.” arXiv:2006.12471 [Physics, Stat].
Arguedas, Robertson, Fletcher, et al. 2022. “Echo Chambers, Filter Bubbles, and Polarisation: A Literature Review.”
Atanasov, Rescober, Stone, et al. 2015. “Distilling the Wisdom of Crowds: Prediction Markets Versus Prediction Polls.” Academy of Management Proceedings.
Banerjee. 1992. “A Simple Model of Herd Behavior.” The Quarterly Journal of Economics.
Baron. 2005. “So Right It’s Wrong: Groupthink and the Ubiquitous Nature of Polarized Group Decision Making.” In Advances in Experimental Social Psychology.
Board, and Meyer-ter-Vehn. 2021. “Learning Dynamics in Social Networks.” Econometrica.
Borondo, Borondo, Rodriguez-Sickert, et al. 2014. “To Each According to Its Degree: The Meritocracy and Topocracy of Embedded Markets.” Scientific Reports.
Brewer. 1993. “Social Identity, Distinctiveness, and In-Group Homogeneity.” Social Cognition.
Collina, Globus-Harris, Goel, et al. 2025. “Collaborative Prediction: Tractable Information Aggregation via Agreement.”
Conitzer. 2013. “The Maximum Likelihood Approach to Voting on Social Networks.” In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).
Coscia, and Vandeweerdt. 2022. “Posts on Central Websites Need Less Originality to Be Noticed.” Scientific Reports.
Danan, Gajdos, Hill, et al. 2016. “Robust Social Decisions.” American Economic Review.
Dawid, and Skene. 1979. “Maximum Likelihood Estimation of Observer Error‐Rates Using the EM Algorithm.” Journal of the Royal Statistical Society Series C.
Diakonikolas, and Kane. 2023. Algorithmic High-Dimensional Robust Statistics.
Dinesen, and Sønderskov. 2013. “Ethnic Diversity and Social Trust: The Role of Exposure in the Micro-Context.” Ethnic Diversity and Social Capital.
Farrell, and Shalizi. 2015. “Pursuing Cognitive Democracy.” From Voice to Influence: Understanding Citizenship in a Digital Age; Allen, D., Light, J., Eds.
Fügener, Grahl, Gupta, et al. 2021. “Will Humans-in-the-Loop Become Borgs? Merits and Pitfalls of Working with AI.” MIS Quarterly.
Fu, and Wang. 2008. “Coevolutionary Dynamics of Opinions and Networks: From Diversity to Uniformity.” Physical Review E.
Garip. 2020. “What Failure to Predict Life Outcomes Can Teach Us.” Proceedings of the National Academy of Sciences.
Garrabrant, Benson-Tilsen, Critch, et al. 2020. “Logical Induction.”
Golub, and Jackson. 2010. “Naïve Learning in Social Networks and the Wisdom of Crowds.” American Economic Journal: Microeconomics.
———. 2011. “Network Structure and the Speed of Learning: Measuring Homophily Based on Its Consequences.” SSRN Scholarly Paper ID 1784542.
———. 2012. “How Homophily Affects the Speed of Learning and Best-Response Dynamics.” The Quarterly Journal of Economics.
Haghtalab, Jackson, and Procaccia. 2020. “Belief Polarization in a Complex World: A Learning Theory Perspective.” SSRN Scholarly Paper ID 3606003.
Hertz, Romand-Monnier, Kyriakopoulou, et al. 2016. “Social Influence Protects Collective Decision Making from Equality Bias.” Journal of Experimental Psychology. Human Perception and Performance.
Hong, and Page. 2004. “Groups of Diverse Problem Solvers Can Outperform Groups of High-Ability Problem Solvers.” Proceedings of the National Academy of Sciences.
Horwitz, and Horwitz. 2007. “The Effects of Team Diversity on Team Outcomes: A Meta-Analytic Review of Team Demography.” Journal of Management.
Ibrahim. 2023. “Learning from Crowdsourced Noisy Annotations: From Dawid-Skene to Deep Neural Networks.”
Jackson. 2009. “Social Structure, Segregation, and Economic Behavior.” Presented as the Nancy Schwartz Memorial Lecture.
Jeppesen, and Lakhani. 2010. “Marginality and Problem-Solving Effectiveness in Broadcast Search.” Organization Science.
Johnson, Velasquez, Restrepo, et al. 2021. “Mainstreaming of Conspiracy Theories and Misinformation.”
Klug, and Bagrow. 2016. “Understanding the Group Dynamics and Success of Teams.” Royal Society Open Science.
Kong. 2019. “Dominantly Truthful Multi-Task Peer Prediction with a Constant Number of Tasks.” arXiv:1911.00272 [Cs, Econ].
Lalitha, Javidi, and Sarwate. 2014. “Social Learning and Distributed Hypothesis Testing.” arXiv:1410.4307 [Cs, Math, Stat].
Lee, and Nathan. 2011. “Does Cultural Diversity Help Innovation in Cities: Evidence from London Firms.” LSE Research Online Documents on Economics.
List, and Goodin. 2001. “Epistemic Democracy: Generalizing the Condorcet Jury Theorem.” Journal of Political Philosophy.
Lorenz. 2010. “Heterogeneous Bounds of Confidence: Meet, Discuss and Find Consensus!” Complexity.
Lublin. 2015. “New Report Finds a ‘Diversity Dividend’ at Work.” WSJ (blog).
Mahmoodi, Bang, Olsen, et al. 2015. “Equality Bias Impairs Collective Decision-Making Across Cultures.” Proceedings of the National Academy of Sciences.
Mann, and Helbing. 2017. “Optimal Incentives for Collective Intelligence.” Proceedings of the National Academy of Sciences.
Masuda, and Redner. 2011. “Can Partisan Voting Lead to Truth?” Journal of Statistical Mechanics: Theory and Experiment.
Mercier, and Claidière. 2021. “Does Discussion Make Crowds Any Wiser?” Cognition.
Miller, Resnick, and Zeckhauser. 2005. “Eliciting Informative Feedback: The Peer-Prediction Method.” Management Science.
Moussaïd, Kämmer, Analytis, et al. 2013. “Social Influence and the Collective Dynamics of Opinion Formation.” PLoS ONE.
Navajas, Niella, Garbulsky, et al. 2018. “Aggregated Knowledge from a Small Number of Debates Outperforms the Wisdom of Large Crowds.” Nature Human Behaviour.
Niemeyer, Veri, Dryzek, et al. 2023. “How Deliberation Happens: Enabling Deliberative Reason.” American Political Science Review.
O’Connor, and Wu. 2021. “How Should We Promote Transient Diversity in Science?”
Olfati-Saber, Fax, and Murray. 2007. “Consensus and Cooperation in Networked Multi-Agent Systems.” Proceedings of the IEEE.
Page. 2008. The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies - New Edition.
———. 2011. Diversity and Complexity. Primers in Complex Systems.
Peters, and Adamou. 2015. “An Evolutionary Advantage of Cooperation.” arXiv:1506.03414 [Nlin, q-Bio, q-Fin].
Prelec. 2004. “A Bayesian Truth Serum for Subjective Data.” Science.
Prelec, Seung, and McCoy. 2017. “A Solution to the Single-Question Crowd Wisdom Problem.” Nature.
Radanovic, and Faltings. 2013. “A Robust Bayesian Truth Serum for Non-Binary Signals.” In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence. AAAI’13.
Ren, and Beard. 2005. “Consensus Seeking in Multiagent Systems Under Dynamically Changing Interaction Topologies.” Automatic Control, IEEE Transactions on.
Skerry. 2002. “Beyond Sushiology: Does Diversity Work?” Brookings Institution (blog).
Stelmakh, Rastogi, Shah, et al. 2020. “A Large Scale Randomized Controlled Trial on Herding in Peer-Review Discussions.”
Sunstein, and Hastie. 2014. Wiser: Getting Beyond Groupthink to Make Groups Smarter.
Syed. 2020. Rebel Ideas: The Power of Diverse Thinking.
Trouche, Sander, and Mercier. 2014. “Arguments, More Than Confidence, Explain the Good Performance of Reasoning Groups.” SSRN Scholarly Paper ID 2431710.
van den Steen. 2010. “Culture Clash: The Costs and Benefits of Homogeneity.” Management Science.
Weisbuch, Deffuant, Amblard, et al. 2002. “Meet, Discuss, and Segregate!” Complexity.
Witkowski, and Parkes. 2012. “A Robust Bayesian Truth Serum for Small Populations.” In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence. AAAI’12.
Wojcik, Hilgard, Judd, et al. 2022. “Birdwatch: Crowd Wisdom and Bridging Algorithms Can Inform Understanding and Reduce the Spread of Misinformation.”
Xu, and Dean. 2023. “Decision-Aid or Controller? Steering Human Decision Makers with Algorithms.”
Zhang, Chen, Zhou, et al. 2016. “Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing.” In Journal of Machine Learning Research.