Civic technology

Utopian governance using technology, including generative AI, Electrohabermas, digital deliberation, platform democracy

2025-10-27 — 2026-05-07

In Which AI Mediation of Collective Deliberation Is Examined as an Epistemic Mechanism, With Bridging-Based Aggregation Systems Such as Community Notes and Polis Considered as Means of Surfacing Cross-Partisan Consensus.

adversarial
AI safety
bounded compute
communicating
cooperation
culture
economics
extended self
faster pussycat
incentive mechanisms
institutions
language
machine learning
markets
mind
money
neural nets
NLP
security
sovereign
technology
wonk

The companion notebook on utopian governance asks what systems should we have? — sortition, futarchy, liquid democracy, and other institutional designs. This notebook asks a different question: what platforms and tools might help us actually govern better? In particular, what’s the best, kindest, and wisest collective behaviour we could achieve if generative AI and digital platforms helped mediate governance?

A partial counterpart to AI disempowerment of humans is AI empowerment of collective decision-making. This isn’t the same as wondering how we might democratize AI — that’s some kind of dual question and also interesting.

1 An epistemic problem

Governance is partially an epistemic problem: how does a collective discover which policies will actually produce good outcomes, given that no individual knows enough? It can also be a problem of more naked coercion, coalition dynamics, deception, and so on; we mostly ignore those here. Civic tech is for the slice that tools can plausibly do work on: deliberation, truth-finding, preference aggregation, plus the psychological layer that comes with putting humans in the loop. The surrounding politics — power, identity, mammalian ego — sits mostly outside what mechanism design can touch; we deliberately scope down to where the tools have traction, on the bet that any functioning coalition, movement, team, or parliament has to do that epistemic work to coordinate at all, regardless of how much else is in play. The deliberative tools must still operate inside environments shaped by adversarial manipulation, captured platforms, and engagement-economy incentives that reward division (Ovadya and Thorburn 2023) — but that is a robustness question for the same epistemic mechanism, not a different kind of problem. For the broader context of how communities form and maintain shared knowledge, see epistemic communities; for the broader theory of decision-making under bounded compute, see political economy of cognition.

Social choice theory frames this as preference aggregation — how to combine what people want. Preferences, though, are coupled with beliefs in an interesting way. People don’t disagree about climate policy because they want different temperatures. They can disagree in good faith because they hold different models of how the economy, the atmosphere, and political institutions interact, and whether any of these will in fact impact them in any important way. The preference-aggregation framing (voting, polls, referenda) does not seem to be a tool for the belief-aggregation job.

Several mechanisms in the utopian governance notebook attack this directly. Prediction markets aggregate beliefs by rewarding accuracy — but they tell us what people think will happen, not what would happen if we intervened (the causal validity problem). Futarchy tries to bridge the gap by conditioning markets on policy choices, but inherits the causal difficulties. Reputation systems are a softer version of the same idea: weight opinions by track record rather than by majority.

The belief/preference framing also misses a psychological layer. Much of contemporary democratic dysfunction looks less like ideological disagreement and more like affective polarization — emotional dislike of the out-party, more or less independent of any policy disagreement. A surprising amount of the deliberative-AI literature targets this layer rather than the belief layer: the goal is “conflict transformation” rather than consensus-finding (Ovadya and Thorburn 2023; Ovadya et al. 2024), enabling productive conflict by closing the perception gap — our beliefs about other people’s beliefs, which are systematically more extreme than the actual distribution. The intergroup-contact and conflict-mediation literatures are the relevant precursors here, alongside social choice. For the underlying group-decision dynamics these tools are trying to engineer for — diversity dividends, information cascades, surprisingly-popular polling, contrarian-vs-consensus tradeoffs — see groupthink and the wisdom of crowds.
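Of the group-decision mechanisms just mentioned, surprisingly-popular polling is compact enough to sketch here: instead of taking the majority answer, ask each respondent both for their own answer and for a prediction of how others will answer, then pick the answer whose actual support most exceeds its predicted support. A toy illustration with invented numbers:

```python
import numpy as np

# Surprisingly-popular polling: the informed minority both answers
# correctly and predicts the crowd will answer incorrectly, so the
# correct answer gets more support than anyone predicted it would.

# Toy question: "Is Sydney the capital of Australia?" (correct: no).
# A misinformed majority says yes, and nearly everyone — informed or
# not — predicts the crowd will mostly say yes.
answers = ["yes", "no"]
observed_share  = np.array([0.65, 0.35])  # what people actually answered
predicted_share = np.array([0.80, 0.20])  # mean prediction of others' answers

surprise = observed_share - predicted_share   # [-0.15, +0.15]
print(answers[int(surprise.argmax())])        # "no" is surprisingly popular
```

The mechanism extracts information from the gap between first-order and second-order beliefs — the same "perception gap" the conflict-transformation tools target.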

The rest of this notebook surveys mechanisms that take a different angle on the same problem. AI-mediated deliberation generates new statements that bridge between positions rather than aggregating existing ones — closer to what Habermas meant by the ideal speech situation: not a vote, but a process that produces justified consensus through structured dialogue. Bridging-based aggregation does more intimate work with political discussions themselves: it uses the signal of who agrees with whom to surface the statements that already cross divides. Participatory platforms try to make input itself binding rather than advisory. Each of these is a partial answer to the problem of healthy deliberation. A more radical alternative — delegated agent governance — gives each principal a fiduciary AI agent and lets agents do the negotiating, with the political and economic spheres potentially merging into a Coasean bargaining stack. The tools surveyed in this notebook all keep the human in the deliberation loop. The design question is how to combine them — markets for factual questions, deliberation for value-laden ones, reputation for weighting expertise, whatever else it takes — and whether AI mediation gets us closer to truth, or just closer to feeling agreeable.

This topic probably deserves its own notebook when the literature matures; for now, eh, you’ve got my rambles.

For the technical question of whether AI systems can serve as epistemic infrastructure rather than just deliberative infrastructure — extracting reliable belief from noisy strategic crowds, rather than only mediating discussion — see learning from the madness of crowds. For the broader institutional-design question — what trade-offs an epistemic community is making between fast convergence, diversity, action-readiness, skimmable summaries, and member-status ratification, and what mechanisms (peer review, prestige, scoring rules, randomised assignment, admission control) discipline those trade-offs — see epistemic communities. The civic-tech tools surveyed here are one slice of that design space, biased toward large-group deliberative outputs.

TODO: connect to Tetlock’s superforecasting literature?

2 AI-mediated deliberation

Can AI help divided groups find common ground? Early experiments suggest it can — perhaps even better than human facilitators do.

2.1 The Habermas Machine

Ekeoma Uzogara’s summary of Tessler et al. (2024):

To act collectively, groups must reach agreement; however, this can be challenging when discussants present very different but valid opinions. Tessler et al. (2024) investigated whether artificial intelligence (AI) can help groups reach a consensus during democratic debate (see Nyhan and Titiunik (2024)). The authors trained a large language model called the Habermas Machine to serve as an AI mediator that helped small UK groups find common ground while discussing divisive political issues such as Brexit, immigration, the minimum wage, climate change, and universal childcare. Compared with human mediators, AI mediators produced more palatable statements that generated wide agreement and left groups less divided. The AI’s statements were more clear, logical, and informative without alienating minority perspectives. This work carries policy implications for AI’s potential to unify deeply divided groups.

See also: (Hernández 2025; Volpe 2025). For how this kind of deliberation works at smaller scales without AI, see community governance.

2.2 Team Mirai

Team Mirai is a Japanese experiment in running lots of governance-cleverness mechanisms at once. See Team Mirai and Democracy:

Imagine an election where every voter has the opportunity to opine directly to politicians on precisely the issues they care about. They’re not expected to spend hours becoming policy experts. Instead, an AI Interviewer walks them through the subject, answering their questions, interrogating their experience, even challenging their thinking.

Voters get immediate feedback on how their individual point of view matches—or doesn’t—a party’s platform, and they can see whether and how the party adopts their feedback. This isn’t like an opinion poll that politicians use for calculating short-term electoral tactics. It’s a deliberative reasoning process that scales, engaging voters in defining policy and helping candidates to listen deeply to their constituents.

This is happening today in Japan. Constituents have spent about eight thousand hours engaging with Mirai’s AI Interviewer since 2025. The party’s gamified volunteer mobilization app, Action Board, captured about 100,000 organizer actions per day in the runup to last week’s election.

It’s how Team Mirai, which translates to ‘The Future Party,’ does politics.

3 Bridging-based aggregation

A family of mechanisms with a shared move: use the signal in who agrees with whom to find content, statements, or policy positions that cross ideological divides. Unlike the Habermas Machine, these systems don’t generate new text; they select among what participants already wrote. Two deployed examples worth knowing are X’s Community Notes and the Polis / vTaiwan stack.

3.1 Community Notes

Community Notes (formerly Birdwatch) surfaces fact-checks on X posts only when the system infers that a note is rated helpful by raters who usually disagree. The published algorithm is a matrix-factorisation model:

\[ r_{un} \;=\; \mu \;+\; i_n \;+\; b_u \;+\; f_u^\top f_n \;+\; \varepsilon_{un}. \]

Here \(r_{un}\) is user \(u\)’s rating of note \(n\) (helpful / not helpful, coded numerically), \(\mu\) is a global intercept, \(b_u\) is a per-user “how positive is this rater” bias, \(i_n\) is the note intercept, and \(f_u^\top f_n\) is the dot product of a low-dimensional user factor and a note factor. Parameters are fit by regularized least squares over the observed ratings.

The factors \(f_u\) and \(f_n\) absorb the main axis of polarity — roughly, partisanship. Their dot product predicts the disagreement pattern: how user polarity aligns with note polarity. What’s left in \(i_n\) is the helpfulness signal with the polarity component divided out. A note with high \(i_n\) is one that raters across the polarity axis converge on calling helpful. That is the bridging score, and a note is shown publicly only when \(i_n\) clears a threshold.
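To make the mechanics concrete, here is a minimal sketch of fitting that model to a synthetic rating matrix by ridge-regularised gradient descent. This is not the production Community Notes code (which regularises intercepts and factors differently, handles sparse ratings, and applies confidence thresholds); the data and hyperparameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic world: two factions of raters, three notes. Note 0 flatters
# faction A, note 1 flatters faction B, note 2 is genuinely helpful —
# both factions rate it positively.
n_users, n_notes = 200, 3
true_fu = np.repeat([1.0, -1.0], n_users // 2)   # rater polarity
true_fn = np.array([1.0, -1.0, 0.0])             # note polarity
true_i  = np.array([0.0, 0.0, 1.0])              # genuine helpfulness

R = (true_i[None, :] + np.outer(true_fu, true_fn)
     + 0.1 * rng.standard_normal((n_users, n_notes)))

# Fit r_un ≈ mu + i_n + b_u + f_u f_n by gradient descent on the
# ridge-regularised squared error (rank-1 factors, as deployed).
mu = R.mean()
i_n = np.zeros(n_notes)
b_u = np.zeros(n_users)
f_u = 0.1 * rng.standard_normal(n_users)
f_n = 0.1 * rng.standard_normal(n_notes)
lam, lr = 0.03, 0.1
for _ in range(3000):
    err = (mu + i_n[None, :] + b_u[:, None] + np.outer(f_u, f_n)) - R
    i_n -= lr * (err.mean(axis=0) + lam * i_n)
    b_u -= lr * (err.mean(axis=1) + lam * b_u)
    f_u -= lr * ((err * f_n[None, :]).mean(axis=1) + lam * f_u)
    f_n -= lr * ((err * f_u[:, None]).mean(axis=0) + lam * f_n)

# The factors absorb the partisan axis; the note intercept i_n is the
# bridging score. Only the cross-faction note clears it.
print(np.round(i_n, 2))   # note 2 gets the largest intercept
```

The point of the toy: the partisan notes get high ratings from half the raters, but that pattern is explained away by \(f_u^\top f_n\), leaving their intercepts low, while the cross-faction note's support survives into \(i_n\).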

Some caveats:

  • It’s a rank-1 factorization (at least, in the deployed version) — a single dominant axis of disagreement is assumed. If the real disagreement graph has factions who disagree along three orthogonal axes, the model regresses out one axis and, in effect, projects the other disagreements onto it. That might still improve on unweighted averaging, but it is not bridging in a strong sense.
  • \(i_n\) is not identifiable without regularization; the regularizer (is it a prior?) on \(f_u\) and \(f_n\) is necessary but its choice affects what counts as “bridging” out in the tail.
  • Adversarial robustness emerges because a manipulator has to coordinate raters with divergent \(f_u\) to move \(i_n\), which is costlier than coordinating raters inside one faction.

3.2 Polis and vTaiwan

Polis solves a related problem with a related pipeline, aimed at structured deliberation rather than ranking fact-checks. Participants submit short statements; everyone votes agree / disagree / pass on the statements of others. The agree/disagree matrix is factored by PCA, giving each participant a 2-D position on an “opinion map”, and \(k\)-means clusters participants into opinion groups (typically two to four). Per-statement consensus is then computed across clusters: a group-informed consensus statement is one that substantially every cluster, weighted by cluster size, agrees with.
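The pipeline is short enough to sketch end-to-end. This toy version (synthetic votes, SVD standing in for PCA, a bare-bones Lloyd's k-means, and a simple min-over-clusters consensus rule) omits much of what production Polis does — pass-vote imputation, choosing \(k\), statistical-significance filtering — but shows the shape:

```python
import numpy as np

# Toy vote matrix: 60 participants x 3 statements; +1 agree, -1 disagree.
# Two opinion groups split over statements 0 and 1; statement 2 is
# something everyone agrees with.
group = np.repeat([0, 1], 30)
V = np.stack([
    np.where(group == 0, 1.0, -1.0),   # statement 0: group-0 applause line
    np.where(group == 0, -1.0, 1.0),   # statement 1: group-1 applause line
    np.ones(60),                       # statement 2: cross-group consensus
], axis=1)

# 1. Opinion map: PCA (via SVD) of the centred vote matrix, keep 2 dims.
X = V - V.mean(axis=0)
U, s, _ = np.linalg.svd(X, full_matrices=False)
coords = U[:, :2] * s[:2]              # participant positions

# 2. Opinion groups: minimal Lloyd's k-means, k=2, deterministic init.
centroids = coords[[0, -1]]            # one seed from each end of the map
for _ in range(10):
    dists = np.linalg.norm(coords[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    centroids = np.array([coords[labels == k].mean(axis=0) for k in (0, 1)])

# 3. Group-informed consensus: score each statement by its agreement
# rate in the *least*-agreeing cluster, so no cluster can be outvoted.
agree = V > 0
per_cluster = np.array([agree[labels == k].mean(axis=0) for k in (0, 1)])
consensus = per_cluster.min(axis=0)
print(consensus)   # only statement 2 scores 1.0
```

The min-over-clusters rule is the Polis move in miniature: a statement with 65% overall agreement but 0% agreement in one cluster scores zero, whereas unanimity inside every cluster scores one regardless of which cluster is larger.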

The Taiwan g0v community and the subsequent vTaiwan process, under Audrey Tang, ran Polis at policy scale — producing consensus recommendations on Uber regulation, fintech licensing, online alcohol sales, and more. Tang’s broader framing, “Plurality,” treats this family of tools as infrastructure for collective intelligence across diversity.

Decoder: Polis differs from Community Notes along the following axes:

                       Community Notes          Polis
Output                 per-item score (rank)    per-statement consensus + opinion map
Time                   continuous, per note     session-based
Adversarial pressure   high                     lower (smaller audience)
Downstream use         automatic display        human facilitators curate
Rank                   1 polarity axis          2 PCA components

3.3 The broader family

Aviv Ovadya and Luke Thorburn’s bridging systems work (Ovadya and Thorburn 2023; Ovadya et al. 2024) generalises the design pattern across recommender systems, collective response systems, and human-facilitated mini-publics, and flags its characteristic failure modes: what if there is no bridge? what if the apparent “bridge” is a false consensus because we have only modelled one axis of disagreement? what if “bridging” is just a polite name for asymmetric concession from whichever side is less coordinated? Related experimental infrastructure is discussed at the Meaning Alignment Institute and the AI & Democracy Foundation.

For the preference aggregation problem, see social choice (classical preference aggregation), reputation systems (iterative weighting), and epistemic communities on the “whose judgement carries weight” question.

TODO: ingest Wojcik et al. (2022).

4 Participatory civic platforms

Most “participation” in existing democracies is consultation: comment boxes, public submissions, surveys, town halls. The institution asks for input, then decides what to do with it. This is better than nothing, but it doesn’t change the governance structure — the same people make the same decisions, just with more information (which they may or may not use).

The tools below aim at something stronger: participation-as-governance, where the mechanism design of the platform itself determines how input translates into outcomes. Participatory budgeting with binding commitments, consent-based policy revision, structured deliberation with decision rules — these aren’t just input channels, they’re alternative governance architectures. The threat models differ accordingly: consultation fails through captured input (powerful voices dominate the comment box), while governance-by-platform fails through mechanism failure (the rules produce perverse outcomes). Different failure modes require different defences.

Not all of this is AI-dependent — much of it is about building better infrastructure for human participation; that is, it is closer to UX design than to machine learning.

See also platform democracy and kinder social media on redesigning online public spaces, and delegated agent economies on what happens when AI agents act on our behalf.

4.1 Metagov

Metagov hosts a stable of interesting projects for online community governance. Joshua Tan is head of research; I’m keen to see what the organisation does next. See also Metagov News (Special AI Issue) - Nov 2025.

  • KOI pond

    Knowledge Organisation Infrastructure (KOI) is an open protocol that allows communities to collaboratively manage knowledge on their own terms while remaining interoperable with others. Developed by BlockScience with contributions from Metagov and the Australian Research Council Centre of Excellence for Automated Decision-Making and Society (ADM+S), KOI is designed for contexts where knowledge needs to be contextual, traceable, and machine-readable without forcing everyone into the same database or governance model.

    KOI allows different groups to organise, reference, and share knowledge in a modular, consent-based way. It enables interoperability without centralisation, creating a shared architecture for collective intelligence while preserving local control.

  • PolicyKit: This is software for online communities to govern themselves. It lets communities create and enforce their own rules and decision-making processes.

  • Govbase: An open-source, crowd-sourced database of online governance projects, tools, organizations, and concepts.

  • Collective Voice: A project to integrate Metagov with Open Collective, exploring how collective governance can work with the financial practices of online communities.

  • Interop1: An initiative that aims to create a more interoperable ecosystem for online deliberation and funds open-source tools for deliberation and digital governance.

  • […]

4.2 Permissionless infrastructure

A different axis of variation: not which mechanism aggregates input but whose substrate the platform runs on. The blockchain-adjacent communities have a distinctive value stack — open global participation, censorship resistance, credible neutrality, auditability — which they treat as preconditions for anything we’d call governance, rather than as features.

Zuzalu introduced me to Zuzalu.city and, in turn, to Make Ethereum Cypherpunk Again:

Many of these values are shared not just by many in the Ethereum community, but also by other blockchain communities, and even non-blockchain decentralization communities, though each community has its own unique combination of these values and how much each one is emphasized.

  • Open global participation: anyone in the world should be able to participate as a user, observer or developer, on a maximally equal footing. Participation should be permissionless.
  • Decentralization: minimize the dependence of an application on any one single actor. In particular, an application should continue working even if its core developers disappear forever.
  • Censorship resistance: centralized actors should not have the power to interfere with any given user’s or application’s ability to operate. Concerns around bad actors should be addressed at higher layers of the stack.
  • Auditability: anyone should be able to validate an application’s logic and its ongoing operation (eg. by running a full node) to make sure that it is operating according to the rules that its developers claim it is.
  • Credible neutrality: base-layer infrastructure should be neutral, and in such a way that anyone can see that it is neutral even if they do not already trust the developers.
  • Building tools, not empires. Empires try to capture and trap the user inside a walled garden; tools do their task but otherwise interoperate with a wider open ecosystem.
  • Cooperative mindset: even while competing, projects within the ecosystem cooperate on shared software libraries, research, security, community building and other areas that are commonly valuable to them. Projects try to be positive-sum, both with each other and with the wider world.

5 Incoming

6 References

Acemoglu, and Ozdaglar. 2011. “Opinion Dynamics and Learning in Social Networks.” Dynamic Games and Applications.
Allen-Zhu, and Xu. 2025. “DOGE: Reforming AI Conferences and Towards a Future Civilization of Fairness and Justice.” SSRN Scholarly Paper.
Almaatouq, Rahimian, Burton, et al. 2021. “When Social Influence Promotes the Wisdom of Crowds.” arXiv:2006.12471 [Physics, Stat].
Arguedas, Robertson, Fletcher, et al. 2022. “Echo Chambers, Filter Bubbles, and Polarisation: A Literature Review.”
Atanasov, Rescober, Stone, et al. 2015. “Distilling the Wisdom of Crowds: Prediction Markets Versus Prediction Polls.” Academy of Management Proceedings.
Baron. 2005. “So Right It’s Wrong: Groupthink and the Ubiquitous Nature of Polarized Group Decision Making.” In Advances in Experimental Social Psychology.
Burton, Lopez-Lopez, Hechtlinger, et al. 2024. “How Large Language Models Can Reshape Collective Intelligence.” Nature Human Behaviour.
Conitzer, Freedman, Heitzig, et al. 2024. “Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback.” In Proceedings of the 41st International Conference on Machine Learning. ICML’24.
Dai, and Fleisig. 2024. “Mapping Social Choice Theory to RLHF.” In.
Danan, Gajdos, Hill, et al. 2016. “Robust Social Decisions.” American Economic Review.
Drolsbach, Solovev, and Pröllochs. 2024. “Community Notes Increase Trust in Fact-Checking on Social Media.” Edited by David Rand. PNAS Nexus.
Farrell, and Shalizi. 2015. “Pursuing Cognitive Democracy.” From Voice to Influence: Understanding Citizenship in a Digital Age; Allen, D., Light, J., Eds.
Fish, Gölz, Parkes, et al. 2025. “Generative Social Choice.”
Golub, and Jackson. 2012. “How Homophily Affects the Speed of Learning and Best-Response Dynamics.” The Quarterly Journal of Economics.
Goyal, Chang, and Terry. 2024. “Designing for Human-Agent Alignment: Understanding What Humans Want from Their Agents.” In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems.
Greenwald, and Stiglitz. 1986. “Externalities in Economies with Imperfect Information and Incomplete Markets.” The Quarterly Journal of Economics.
Gudiño, Grandi, and Hidalgo. 2024. “Large Language Models (LLMs) as Agents for Augmented Democracy.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
Hernández. 2025. “Towards Automating Deliberation? The Idea of Deliberative Democracy Embedded in Google’s Habermas Machine.” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society.
Hertz, Romand-Monnier, Kyriakopoulou, et al. 2016. “Social influence protects collective decision making from equality bias.” Journal of Experimental Psychology. Human Perception and Performance.
Kasirzadeh, and Gabriel. 2025. “Characterizing AI Agents for Alignment and Governance.”
Lazar. 2024a. “Lecture I: Governing the Algorithmic City.”
———. 2024b. “Lecture II: Communicative Justice and the Distribution of Attention.”
List, and Goodin. 2001. “Epistemic Democracy: Generalizing the Condorcet Jury Theorem.” Journal of Political Philosophy.
Lloyd, Nguyen, Levy, et al. 2025. “Beyond Community Notes: A Framework for Understanding and Building Crowdsourced Context Systems.”
Lorenz. 2010. “Heterogeneous Bounds of Confidence: Meet, Discuss and Find Consensus!” Complexity.
Masuda, and Redner. 2011. “Can Partisan Voting Lead to Truth?” Journal of Statistical Mechanics: Theory and Experiment.
Mercier, and Claidière. 2021. “Does Discussion Make Crowds Any Wiser?” Cognition.
Navajas, Niella, Garbulsky, et al. 2018. “Aggregated Knowledge from a Small Number of Debates Outperforms the Wisdom of Large Crowds.” Nature Human Behaviour.
Novelli, Argota Sánchez-Vaquerizo, Helbing, et al. 2025. “A Replica for Our Democracies? On Using Digital Twins to Enhance Deliberative Democracy.” AI & SOCIETY.
Nyhan, and Titiunik. 2024. “Public Opinion Alone Won’t Save Democracy.” Science.
O’Connor, and Wu. 2021. “How Should We Promote Transient Diversity in Science?”
Ovadya. 2023a. “Reimagining Democracy for AI.” Journal of Democracy.
———. 2023b. “‘Generative CI’ Through Collective Response Systems.”
Ovadya, and Thorburn. 2023. “Bridging Systems: Open Problems for Countering Destructive Divisiveness Across Ranking, Recommenders, and Governance.”
Ovadya, Thorburn, Redman, et al. 2024. “Toward Democracy Levels for AI.” In.
Qiu, He, Chugh, et al. 2025. “The Lock-in Hypothesis: Stagnation by Algorithm.” In.
Schneier, and Sanders. 2025. Rewiring Democracy: How AI Will Transform Our Politics, Government, and Citizenship. Strong Ideas.
Schrock. 2018. Civic Tech: Making Technology Work for People.
Seger, Ovadya, Siddarth, et al. 2023. “Democratising AI: Multiple Meanings, Goals, and Methods.” In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society. AIES ’23.
Shahidi, Rusak, Manning, et al. 2025. “The Coasean Singularity? Demand, Supply, and Market Design with AI Agents.” In. Working Paper Series.
Shin, Floch, Rask, et al. 2024. “A Systematic Analysis of Digital Tools for Citizen Participation.” Government Information Quarterly.
Slaughter, Peytavin, Ugander, et al. 2025. “Community Notes Reduce Engagement with and Diffusion of False Information Online.” Proceedings of the National Academy of Sciences.
Sorensen, Mishra, Patel, et al. 2025. “Value Profiles for Encoding Human Variation.”
Stelmakh, Rastogi, Shah, et al. 2020. “A Large Scale Randomized Controlled Trial on Herding in Peer-Review Discussions.”
Suresh, Tseng, Young, et al. 2024. “Participation in the Age of Foundation Models.” In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. FAccT ’24.
Tan, and Abramsky. 2022. “Institutions under composition.”
Tessler, Bakker, Jarrett, et al. 2024. “AI Can Help Humans Find Common Ground in Democratic Deliberation.” Science.
Tomašev, Franklin, Leibo, et al. 2025. “Virtual Agent Economies.”
Trouche, Sander, and Mercier. 2014. “Arguments, More Than Confidence, Explain the Good Performance of Reasoning Groups.” SSRN Scholarly Paper ID 2431710.
van den Steen. 2010. “Culture Clash: The Costs and Benefits of Homogeneity.” Management Science.
Volpe. 2025. “Toward an Artificial Deliberation? On Google DeepMind’s Habermas Machine.” Ethics and Information Technology.
Weisbuch, Deffuant, Amblard, et al. 2002. “Meet, Discuss, and Segregate!” Complexity.
Wojcik. 2018. “Do Birds of a Feather Vote Together, or Is It Peer Influence?” Political Research Quarterly.
Wojcik, Hilgard, Judd, et al. 2022. “Birdwatch: Crowd Wisdom and Bridging Algorithms Can Inform Understanding and Reduce the Spread of Misinformation.”
Xu, and Dean. 2023. “Decision-Aid or Controller? Steering Human Decision Makers with Algorithms.”
Yang, and Bachmann. 2025. “Bridging Voting and Deliberation with Algorithms: Field Insights from vTaiwan and Kultur Komitee.”
Yang, Dailisan, Korecki, et al. 2024. “LLM Voting: Human Choices and AI Collective Decision-Making.” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society.
Young, Ehsan, Singh, et al. 2024. “Participation Versus Scale: Tensions in the Practical Demands on Participatory AI.” First Monday.
Zerilli. 2025. A Citizen’s Guide to Artificial Intelligence.