AI disempowerment of humans

Races to the bottom in human relevance, gradual disempowerment

2021-09-20 — 2026-04-09

Wherein the process of gradual disempowerment — driven by feedback loops and competitive selection rather than sudden catastrophe — is traced across economic, epistemic, and political domains.

AI safety
economics
faster pussycat
innovation
language
machine learning
mind
neural nets
technology
UI

On the prospect of human domestication via AI. Is that bad? It seems frightening, but I suppose we could imagine virtuous versions of it.

The iconic piece in this domain is Paul Christiano’s 2019 essay What failure looks like:

Amongst the broader population, many folk already have a vague picture of the overall trajectory of the world and a vague sense that something has gone wrong. There may be significant populist pushes for reform, but in general these won’t be well-directed. Some states may really put on the brakes, but they will rapidly fall behind economically and militarily, and indeed “appear to be prosperous” is one of the easily-measured goals for which the incomprehensible system is optimising.

Amongst intellectual elites there will be genuine ambiguity and uncertainty about whether the current state of affairs is good or bad. People really will be getting richer for a while. Over the short term, the forces gradually wresting control from humans do not look so different from (e.g.) corporate lobbying against the public interest, or principal-agent problems in human institutions. There will be legitimate arguments about whether the implicit long-term purposes being pursued by AI systems are really so much worse than the long-term purposes that would be pursued by the shareholders of public companies or corrupt officials.

We might describe the result as “going out with a whimper.” Human reasoning gradually stops being able to compete with sophisticated, systematised manipulation and deception which is continuously improving by trial and error; human control over levers of power gradually becomes less and less effective; we ultimately lose any real ability to influence our society’s trajectory.

See Kulveit et al. (2025) for a modernisation, and the accompanying web page.

People talk about “gradual” disempowerment to distinguish it from sudden apocalypse, but I see no reason for such disempowerment to be all that gradual. Both AI-led economic transition and epistemic transition can have rather sudden effects.
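
One mechanism for that suddenness is increasing-returns lock-in of the kind Arthur (1989) describes. In this toy version (the two options and all the numbers are my own illustration, not Arthur’s model), each new adopter picks whichever option currently pays better, payoffs rise with installed base, and adoption of the faster-improving option tips from a noisy coin-flip to near-total dominance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two competing ways of running a process: human-run vs AI-run.
base = np.array([1.00, 0.95])      # human-run starts slightly ahead
returns = np.array([0.01, 0.03])   # AI-run improves faster per adopter
counts = np.zeros(2)               # installed base of each option

ai_share = []
for _ in range(2000):
    # Each adopter picks the option with the higher current noisy payoff.
    payoff = base + returns * counts + rng.normal(0.0, 0.5, size=2)
    counts[payoff.argmax()] += 1
    ai_share.append(counts[1] / counts.sum())

# Early adoption is noisy and roughly even; once the AI option's
# installed-base advantage swamps the noise, its share tips abruptly.
print([f"{s:.2f}" for s in ai_share[199::400]])
```

Nothing in the setup is discontinuous; the jump comes purely from the positive feedback between adoption and payoff.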

We can distinguish a few major lenses for analysing such a drift away from human agency. The authors surveyed below use different metaphors (“the narrow corridor,” “lock-in,” “cultural feedback loops”), but they seem to be circling the same underlying worry: that once AI systems substitute for human labour, cognition, and culture, the implicit bargains that kept societies responsive to human needs will dissolve.

What failure looks like was, as far as I can recall, the first post to claim that we could “go out with a whimper” as systems gradually optimise for goals misaligned with human flourishing, eroding our control without triggering apocalypse. It became the intellectual seed for much of what followed, inspiring formalisations such as the “gradual disempowerment” report of Kulveit et al. (2025).

Most of the newer papers (Bullock, Hammond, and Krier 2025; Kulveit et al. 2025; MacInnes, Garfinkel, and Dafoe 2024; Qiu et al. 2025) can be read as elaborations of some common core ideas: AI systems come to substitute for human labour, cognition, and culture; the institutions that once depended on humans thereby lose their incentive to remain responsive to them; feedback loops between economic, epistemic, and political systems accelerate the drift; and competitive selection punishes any actor who unilaterally opts out.

Each author picks a different domain lens to illustrate how this skeleton plays out; a toy sketch of the selection mechanism follows.
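
As a minimal illustration of the selection half of that skeleton (my own toy replicator dynamics, not a model from any of the cited papers), suppose actors choose how much control to delegate to AI: delegation raises competitive fitness but erodes human control, and selection drives the strategy mix toward full delegation even though everyone would prefer the opposite.

```python
import numpy as np

# Strategies: how much control an actor delegates to AI, from none to all.
levels = np.linspace(0.0, 1.0, 11)
shares = np.full_like(levels, 1.0 / len(levels))  # initial strategy mix

# Hypothetical payoffs: delegation wins competitions but erodes control.
fitness = 1.0 + 2.0 * levels
welfare = 1.0 - levels

for _ in range(200):
    # Replicator update: strategies grow in proportion to relative fitness.
    shares = shares * fitness / (shares @ fitness)

print(f"mean delegation after selection: {shares @ levels:.2f}")
print(f"mean human control remaining:    {shares @ welfare:.2f}")
```

The point of the sketch is that no actor needs to want disempowerment; the strategy distribution drifts there because relative fitness, not welfare, drives replication.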

The LessWrong and Alignment Forum essays (e.g. Critch’s Multipolar Failure and Boundaries posts) supply the informal language — RAAPs (robust agent-agnostic processes), lock-in, multipolar traps — that the academic papers then mathematise. The GradualDisempowerment.ai portal accompanying Kulveit et al. (2025) is explicitly meant to bridge the informal and the formal. Meanwhile, popular pieces like “Spirals of Delusion” in Foreign Affairs translate the motif into a geopolitical idiom.

What unifies them is a shift from “sudden catastrophic takeover” to “systemic drift” as the object of risk analysis.

1 Lock-in models

Qiu et al. (2025) hypothesise a feedback loop between LLMs and their users: models learn beliefs from human data, reinforce those beliefs with generated content, then reabsorb them in the next round of training, so that collective belief diversity falls and even false beliefs can lock in.
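
To make that loop concrete, here is a toy simulation (my own sketch, not the formalism of Qiu et al. 2025): a “model” is repeatedly refit to the pooled beliefs of its users with a majority-sharpening step, users then shift toward the model’s output, and belief diversity (measured by Shannon entropy) collapses. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

n_agents, n_opinions, n_rounds = 1000, 10, 60
adoption = 0.3   # how strongly users shift toward the model's output
sharpen = 2.0    # the model over-weights majority views when "trained"

# Each agent holds a categorical belief distribution over opinions.
beliefs = rng.dirichlet(np.ones(n_opinions), size=n_agents)

def mean_entropy(p):
    """Average Shannon entropy across agents: a crude diversity measure."""
    p = np.clip(p, 1e-12, 1.0)
    return float((-(p * np.log(p)).sum(axis=1)).mean())

for t in range(n_rounds):
    # "Training": pool user beliefs and sharpen toward the majority view.
    model = beliefs.mean(axis=0) ** sharpen
    model /= model.sum()
    # "Usage": each agent mixes its beliefs with the model's output.
    beliefs = (1 - adoption) * beliefs + adoption * model
    if t % 15 == 0:
        print(f"round {t:2d}: mean belief entropy = {mean_entropy(beliefs):.3f}")
```

With sharpen = 1 the pooled distribution is preserved and only inter-agent diversity collapses; any sharpening above 1 also concentrates the consensus itself onto the modal opinion, which is the lock-in worry.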

2 Incoming

3 References

Acemoglu, and Robinson. 2020. The Narrow Corridor: States, Societies, and the Fate of Liberty.
Arthur. 1989. “Competing Technologies, Increasing Returns, and Lock-In by Historical Events.” The Economic Journal.
Barez, Friend, Reid, et al. 2025. “Toward Resisting AI-Enabled Authoritarianism.”
Bullock, Hammond, and Krier. 2025. “AGI, Governments, and Free Societies.”
Carlsmith. 2024. “Is Power-Seeking AI an Existential Risk?”
Chu, Rule, Goddu, et al. 2025. “Fun Isn’t Easy: Children Selectively Manipulate Task Difficulty When ‘Playing for Fun’ Versus ‘Playing to Win’.” Developmental Psychology.
Critch, Dennis, and Russell. 2022. “Cooperative and Uncooperative Institution Designs: Surprises and Problems in Open-Source Game Theory.”
Dafoe. 2019. “Value Erosion for FHI July 2019.”
Farrell, Gopnik, Shalizi, et al. 2025. “Large AI Models Are Cultural and Social Technologies.” Science.
Fish, Gölz, Parkes, et al. 2025. “Generative Social Choice.”
Heitzig, and Potham. 2025. “Model-Based Soft Maximization of Suitable Metrics of Long-Term Human Power.”
Kenton, Kumar, Farquhar, et al. 2023. “Discovering Agents.” Artificial Intelligence.
Kulveit, Douglas, Ammann, et al. 2025. “Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development.”
Loewith, and Street. 2025. “Mutual Prediction in Human–AI Coevolution.” Antikythera Digital Journal.
MacInnes, Garfinkel, and Dafoe. 2024. “Anarchy as Architect: Competitive Pressure, Technology, and the Internal Structure of States.” International Studies Quarterly.
Pang. 2026. “On the Markovian Dynamics of Computational Systems.”
Patell. 2025. “Cooperation as Bulwark: Evolutionary Game Theory and the Internal Institutional Structure of States.”
Qiu, He, Chugh, et al. 2025. “The Lock-in Hypothesis: Stagnation by Algorithm.”
Sturgeon, Samuelson, Hyams, et al. 2025. “HumanAgencyBench: Do Language Models Support Human Agency?”
Tarsney. 2025. “Will Artificial Agents Pursue Power by Default?”
Turner, Smith, Shah, et al. 2021. “Optimal Policies Tend To Seek Power.” In Advances in Neural Information Processing Systems.
Zhu, Lu, Ming, et al. 2025. “Designing Meaningful Human Oversight in AI.” SSRN Scholarly Paper.