The deep history of intelligence
Alignment at the omega point
2025-06-04 — 2025-09-23
Wherein the lineage of intelligence is traced through thermodynamics and prediction, and Chaisson’s energy‑rate‑density metric is invoked to link cells, cities and superorganisms.
Let’s reason backwards from the final destination of civilisation, if such a thing there be. What intelligences persist at the omega point? With what is superintelligence aligned in the big picture?
Various authors have tried to put modern AI developments in continuity with historical trends from less materially sophisticated societies, through more legible, compute-oriented societies, to some set of attractors at the end of history. Computational superorganisms. Singularities. Some authors even start before humans and roll it back to microbes or prebiotic systems.
I consider all such models of big-picture trends in information processing to be models of intelligence in big history.
cf Empowerment, fitness and loss functions, …
1 From Cells to Superorganisms
OK, if we want this to work, we need to decide: what even is intelligence? When we discuss intelligence, we usually imagine a clever individual: a human, an octopus, perhaps an AI. But what if we can formally identify intelligence as some property of superorganisms, or of generic organized systems, one that has been evolving and scaling for billions of years?
Across fields like history, evolution, and systems ecology, we find visionaries or cranks (cranksionaries!) who suggest that the story of our universe is one of growing computational power, where “intelligence” is the capacity of a system to harness energy to process information and shape its environment. From such a perspective, a bacterial colony, a forest ecosystem, and a modern megacity might all be points on a continuum of intelligence. They are all systems that create and maintain order against the relentless tide of entropy. And that’s smart, y’know, dude.
I’m trying to work out if any of those ideas are useful to me in the sense of generating testable hypotheses, or useful for narrowing the probability space of evolving civilisation, etc.
2 The Thermodynamic Engine
At the most basic level, we can suppose intelligence runs on energy. Eric Chaisson’s work in cosmic evolution provides a putatively universal metric: energy rate density (\(\phi m\)), the flow of energy per second per gram (Chaisson 2011). When you plot this value for various systems, a trend appears. Stars have a low \(\phi m\). Plants have a higher one. Animals, higher still. A modern human society, with its immense metabolism of fossil fuels and electricity, has an energy rate density orders of magnitude greater than any biological system that came before it. In Chaisson’s model, this is the thermodynamic signature of “complexity”. To build and maintain intricate structures, a system must channel vast amounts of energy through its mass.
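To make the units concrete, here is a back-of-envelope sketch. The inputs are my own round numbers, not Chaisson’s published figures, and the society bookkeeping (world primary energy divided by human biomass) is just one of several defensible choices:

```python
# Back-of-envelope energy rate densities, in Chaisson's units of erg/s/g.
# All inputs are rough order-of-magnitude guesses, not Chaisson's values.

def phi_m(power_watts: float, mass_kg: float) -> float:
    """Energy rate density: power per unit mass, converted to erg/s/g."""
    ERG_PER_JOULE = 1e7
    GRAMS_PER_KG = 1e3
    return power_watts * ERG_PER_JOULE / (mass_kg * GRAMS_PER_KG)

systems = {
    "Sun":     (3.8e26, 2.0e30),   # luminosity ~3.8e26 W, mass ~2e30 kg
    "human":   (1.0e2, 7.0e1),     # ~100 W basal metabolism, ~70 kg
    # Industrial society: ~18 TW of primary energy over the biomass of
    # ~8e9 people; Chaisson's own denominator differs in detail.
    "society": (1.8e13, 8e9 * 70),
}

for name, (watts, kg) in systems.items():
    print(f"{name:8s} phi_m ~ {phi_m(watts, kg):12.1f} erg/s/g")
```

The Sun lands near 2 erg/s/g and industrial society near \(10^5\), which reproduces the orders-of-magnitude spread described above.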
This perspective resembles Hagens’ conception of the global economy as a mindless, energy-hungry “Superorganism” (Hagens 2020). Our civilisation’s structure is entirely dependent on a massive throughput of high-quality energy, which allows for the intricate division of labour and information technology that define our modern intelligence. The downside, Hagens argues, is that this superorganism is “growth constrained,” behaviourally locked into myopically maximizing its energy consumption.
3 The Physics of Prediction
But the link between energy and intelligence is deeper than mere fuel. A fundamental principle of thermodynamics dictates that to be energy-efficient, a system “must” be predictive. Work by Susanne Still and colleagues proposes a formal equivalence between thermodynamic waste (dissipated heat) and informational waste (Still et al. 2012) (cf. Selection theorems). The information a system remembers about its past can be split into two kinds: a useful part that helps predict the future, and a useless part the authors call “nostalgia”. Their conclusion is that this useless, non-predictive information is directly proportional to the energy the system must dissipate as heat.
In this model, wasted energy is the physical cost of holding onto memories that don’t help a system anticipate what’s next. This is a thermodynamic imperative: any system that evolves toward maximal energy efficiency is, by physical law, evolving to become a better prediction machine. This seems connected to the thermodynamics of life and the statistical mechanics of statistics.
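In symbols, as I transcribe the result (see the paper for the exact setting): let \(s_t\) be the system’s state and \(x_t\) the driving signal, and define

\[
I_{\text{mem}} = I(s_t; x_t), \qquad I_{\text{pred}} = I(s_t; x_{t+1}).
\]

Their key relation ties the work dissipated over a driving step to the “nostalgia” \(I_{\text{mem}} - I_{\text{pred}}\),

\[
\beta\, \langle W_{\text{diss}}[x_t \to x_{t+1}] \rangle = I_{\text{mem}} - I_{\text{pred}},
\]

which is non-negative when the drive is Markov, since \(s_t\) can only know about \(x_{t+1}\) through \(x_t\).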
4 A Physics of Adaptation
The principles above suggest that efficient systems must be predictive. Can we roll that concept back a little further into history and think about these little engines of prediction from a physical perspective rather than a biological one? How did matter become predictive, or start managing data in the first place? The most direct and physically grounded attempt to answer this, AFAICT, comes from the work of Jeremy England and colleagues. In England (2013), he argued that self-replication could be a thermodynamically favoured outcome, a highly effective way for matter to dissipate energy. His colleagues have generalized this to “dissipation-driven adaptation” (Perunov, Marsland, and England 2016), which I’ve tried to understand over at the thermodynamics of life notebook.
They make a formal argument derived from the principles of far-from-equilibrium statistical mechanics. Take any system of particles (like a soup of chemicals). Drive it with an external energy source (like sunlight or chemical fuel). Allow it to dissipate that energy as heat into a surrounding bath (like the ocean). England’s theoretical results suggest that over time, such a system will tend to spontaneously self-organize into structures that are exceptionally good at absorbing and dissipating work from the environment.
The idea is that certain life-like properties are extraordinarily good at dissipating energy. The main example is self-replication. A single, complex molecule can only absorb and dissipate so much energy. But a population of a billion such molecules, created through replication, can dissipate vastly more. (I confess my grasp of this part of the argument is extremely shaky; I have a statistician’s understanding of entropy, not a physicist’s.) England’s framework suggests that replication might be a thermodynamically favoured outcome—a particularly effective solution that matter “discovers” for dissipating energy. In this view, “adaptation” is a physical process that can occur even before Darwinian selection kicks in. The system’s structure physically adapts to the patterns of the energy driving it, becoming “well-adapted” in a purely physical sense.
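For what it’s worth, the headline inequality of England (2013), as I transcribe it, bounds a replicator’s growth in terms of its entropy production:

\[
\beta \langle \Delta q \rangle + \Delta s_{\text{int}} \;\geq\; \ln\frac{g}{\delta},
\]

where \(g\) is the replication rate, \(\delta\) the decay rate, \(\langle \Delta q \rangle\) the average heat released into the bath per replication event, and \(\Delta s_{\text{int}}\) the internal entropy change. Replicators that are both fast (high \(g\)) and durable (low \(\delta\)) must pay for it in dissipation.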
The Perunov, Marsland, and England (2016) argument can connect us to something like intelligence because it connects physical adaptation to computation. The proposed link lies in the idea that the system’s physical structure becomes an implicit memory of the environmental forces that shaped it, something memory something something Turing complete. [TODO clarify]
A system that has self-organized to be good at absorbing a specific type of energy (e.g., light of a certain frequency, or mechanical shaking in a particular pattern) has physically changed its shape. That new shape is a durable record of the environmental conditions that drove its formation, in the way that the shape and orientation of a sand dune record the wind that has blown upon it. Now, if it can learn to predict and get better at configuring itself for what will happen next, perhaps it will get better at absorbing the next energetic input and dissipating it, and therefore predominate. If we squint at that, it looks like a motivation for sprouting a brain.
5 I predict therefore I am
If the laws of thermodynamics demand that efficient systems be predictive, is there a principle that explains why predictive systems exist at all? This is the territory of Karl Friston’s Free Energy Principle (FEP), a particularly ambitious—and notoriously difficult-to-parse—attempted unified theory of intelligence and perhaps life (Friston 2013).
The principle posits that any system that persists in a fluctuating world—from a single cell to a brain—must maintain a boundary between itself and its environment. Friston and colleagues call this the Markov blanket, which I believe means the same here as it does in probabilistic graphical models. To maintain this boundary and not dissolve back into environmental chaos, the system must act to keep its internal states within a narrow, predictable range. Further, it must minimize “surprise”—or more technically, an upper bound on surprise called “variational free energy”, which I think is meant in the same sense as in variational inference.
On one hand, the system acts on the world to make its sensory inputs match its expectations (this is autopoiesis, or self-creation). On the other hand, its internal states become a probabilistic model of the environment, constantly updating to make better predictions in a process of active inference.
An intelligent agent, in the Fristonian sense, is something like a system that embodies a predictive model of its environment and acts to make its own predictions come true.
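For the record, the “upper bound on surprise” is the standard variational identity. With \(o\) the observations (sensory states), \(s\) the hidden causes, \(p\) the generative model, and \(q\) the approximate posterior,

\[
F(o, q) = \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right] = D_{\mathrm{KL}}\!\left(q(s)\,\middle\|\,p(s \mid o)\right) - \ln p(o) \;\geq\; -\ln p(o).
\]

Minimizing \(F\) over \(q\) is perception (pulling \(q\) toward the true posterior); minimizing it over actions that change \(o\) is, on this account, behaviour.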
Danger: the “free energy” in the FEP is an information-theoretic quantity, distinct from the thermodynamic free energy of physicists, which causes endless confusion. The debate continues over whether the FEP provides a new fundamental law of nature or a very powerful, but ultimately circular, “as-if” description of any system that manages to not fall apart.
6 Greetings, Fellow Calculating Machines
Now we can see how these pieces fit together. A system uses energy to run computations that implement an information life cycle, and the pressure for energy efficiency forces those computations to become predictive.
David Wolpert and Kyle Harper formalize this at the societal scale with their “Multiple Communicating Machines” (MCM) model (David H. Wolpert and Harper 2025). They treat a society as a computer where occupations and technologies are interacting computational units. A society’s “computational power” is its ability to process information about its environment to effectively extract energy. This creates the central feedback loop of deep history: harvesting more energy allows a society to support more complex computation (more specialized jobs), which in turn allows it to harvest energy more effectively. This is the superorganism’s version of active inference: using its collective intelligence to model its world and act upon that model to ensure its continued existence.
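I find the loop easier to see as a toy dynamical system. To be clear, this is my own cartoon, not the MCM formalism; the functional forms and parameters below are invented:

```python
# A cartoon of the energy/computation feedback loop, NOT Wolpert &
# Harper's actual MCM model. Energy harvested per step scales with
# computational capacity, and capacity grows with whatever surplus
# energy can be reinvested in specialization.

alpha = 0.05    # computation -> energy-harvesting efficiency
beta = 0.02     # surplus energy -> new computational capacity
upkeep = 0.01   # running cost per unit of complexity

E, C, t = 1.0, 1.0, 0
while E < 1e6 and t < 10_000:
    harvest = alpha * C * E           # better models extract more energy
    surplus = harvest - upkeep * C    # complexity is costly to maintain
    E += harvest
    C += beta * max(surplus, 0.0)     # surplus funds new "occupations"
    t += 1

print(f"energy crossed 1e6 at step {t} with computation C = {C:.1f}")
```

With these made-up parameters, growth crawls for a long while and then the loop runs away; the point is only the qualitative shape of the feedback, not any quantitative claim.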
So in this view, the agricultural revolution was a new set of algorithms for manipulating ecosystems. The rise of the state was a new computational architecture for managing large populations. And as Blaise Agüera y Arcas and James Manyika argue, AI represents a new substrate for computation that is fundamentally changing our world. Their phase-based model is a bit less elegant (Arcas et al. 2024), but it might be “true” in some sense, so I’ll slot it in here.
7 Rise of the Superorganism
Many of us, myself included, think it’s reasonable to examine societies as “crude superorganisms” (Boyd and Richerson 1999). As societies grew and became more interconnected, they began to function like integrated, goal-directed entities.
Michael Wong and Stuart Bartlett argue that civilizations following this kind of superlinear scaling—where growth and energy demand accelerate—are on a trajectory towards “asymptotic burnout” (Wong and Bartlett 2022). The interval between crises that require major innovation shortens until the pace of innovation can’t keep up, leading to collapse. That’s their solution to the Fermi Paradox.
They propose two outs: wise civilizations might undergo a “homeostatic awakening”, in which they use their advanced intelligence to recognize this destructive trajectory and consciously shift their priorities from unbounded growth to long-term stability. Or they might roll the dice and try to become a Type III, galaxy-spanning expansionist civilization, though presumably that rarely succeeds; otherwise we’d see more of them.
The growth-based singularity is amusingly similar to an idea proposed by my Master’s supervisor, who found a similar result by curve-fitting economic data (Johansen and Sornette 2001). [TODO clarify] YMMV.
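Both the burnout story and the Sornette-style curve fits lean on the same elementary mathematics: superlinear growth implies a finite-time singularity. If a quantity \(N\) (population, energy use, economic output) obeys

\[
\frac{dN}{dt} = r N^{\beta}, \quad \beta > 1,
\qquad\text{then}\qquad
N(t) = N_0\left(1 - \frac{t}{t_c}\right)^{-1/(\beta-1)},
\quad t_c = \frac{N_0^{\,1-\beta}}{r(\beta-1)},
\]

so \(N\) diverges at the finite horizon \(t_c\). Something (an innovation reset, a homeostatic awakening, a collapse) has to change the dynamics before then.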
8 The Big Table
To clarify the differences, here’s the feature matrix (disclaimer: generated by AI):
Model / Framework | What is “Intelligence”? | Engine of Growth/Evolution | Formalism/Methodology | Scale/Domain |
---|---|---|---|---|
Still et al. (Thermo. of Prediction) | Predictive power; minimizing non-predictive “nostalgic” information. | Thermodynamic pressure for energetic efficiency. | Statistical Physics (Nonequilibrium Thermodynamics) | Universal (Any system with memory) |
Friston’s Free Energy Principle | Active Inference; minimizing variational free energy; embodying a predictive model. | An existential imperative to resist dissipation and minimize surprise. | Bayesian Statistics, Dynamical Systems Theory. | Universal (purportedly from cells to societies) |
England’s Dissipation-Driven Adaptation | “Physical Adaptation”; structures exceptionally good at absorbing and dissipating energy from their environment. | Thermodynamic pressure for systems to self-organize into states that increase entropy production. | Non-Equilibrium Statistical Mechanics (Fluctuation Theorems). | Universal (driven matter, from molecules up) |
Chaisson’s Energy-Rate Density | Energy flow per unit mass (\(\phi m\)) as a measure of complexity. | Increasing capacity to capture and channel energy through a system. | Empirical Metric (erg/s/g) | Universal (Cosmos to Society) |
Shin et al. (Seshat) | Information-processing capacity (e.g., writing, bureaucracy, currency). | Phased growth: scaling in size creates pressure for informational innovations. | Statistical Analysis (PCA) of historical database. | Holocene Societies |
Wolpert & Harper’s MCM | “Computational power”: the ability to process information to achieve goals. | Co-evolutionary feedback loop between energy harvesting and computational capacity. | Formal Computational Model (new class of automata). | Universal (Cells to Societies) |
Hagens’ Superorganism | The emergent metabolic activity of the global economic system. | Maximization of financial surplus, tethered to energy and carbon flows. | Systems View (conceptual framework with empirical data). | Global Human Civilization |
Wong & Bartlett’s Burnout | A civilization’s trajectory of accelerating growth (superlinear scaling). | Positive feedback loops between population, innovation, and energy demand. | Dynamical Systems Theory (conceptual, based on scaling laws). | Planetary Civilizations |
Boyd & Richerson’s Superorganism | Cooperative functioning of large-scale societies. | Gene-culture co-evolution creating social instincts, managed by cultural “work-arounds”. | Evolutionary Theory (conceptual hypothesis). | Human Societies |
9 Understanding by building
Can we design open-ended intelligence?
10 Biological anchors
Ajeya Cotra’s “biological anchors” report attempts to anchor forecasts for modern synthetic intelligence by comparing the compute used to train it against the compute that nature historically used.
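A minimal sketch of the arithmetic, assuming my order-of-magnitude recollections of the report’s reference points are roughly right (the report itself works with wide distributions):

```python
# Order-of-magnitude "biological anchors" arithmetic, from memory of
# Cotra's report; every figure is uncertain by several orders of magnitude.
SECONDS_PER_YEAR = 3.15e7
brain_flops = 1e15  # common rough estimate of human-brain FLOP/s

# Lifetime anchor: compute consumed by one brain over ~30 years of learning.
lifetime_anchor = brain_flops * 30 * SECONDS_PER_YEAR
print(f"lifetime anchor  ~ {lifetime_anchor:.0e} FLOP")   # ~1e24

# Evolution anchor: all neural computation over ~1 Gyr of animal evolution.
# The population-average rate is an invented stand-in chosen to land near
# the report's headline ~1e41 FLOP.
population_flops = 3e24  # mean FLOP/s across every nervous system alive
evolution_anchor = population_flops * 1e9 * SECONDS_PER_YEAR
print(f"evolution anchor ~ {evolution_anchor:.0e} FLOP")  # ~1e41
```

The report’s argument is that plausible training-compute requirements for transformative AI land somewhere in the vast interval between these anchors.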
11 Incoming
Deacon (2012): [TODO clarify]
Incomplete Nature begins by accepting what other theories try to deny: that, although mental contents do indeed lack these material-energetic properties, they are still entirely products of physical processes and have an unprecedented kind of causal power that is unlike anything that physics and chemistry alone have so far explained. Paradoxically, it is the intrinsic incompleteness of these semiotic and teleological phenomena that is the source of their unique form of physical influence in the world. Incomplete Nature meticulously traces the emergence of this special causal capacity from simple thermodynamics to self-organizing dynamics to living and mental dynamics, and it demonstrates how specific absences (or constraints) play the critical causal role in the organization of physical processes that generate these properties.
PIBBSS – Principles of Intelligent Behavior in Biological and Social Systems
Prediction: Life will turn out to be everywhere (after a certain point)
Blaise Agüera y Arcas and James Manyika: AI Is Evolving — And Changing Our Understanding Of Intelligence
In this essay, we will describe five interrelated paradigm shifts informing our development of AI:
- Natural Computing — Computing existed in nature long before we built the first “artificial computers”. Understanding computing as a natural phenomenon will enable fundamental advances not only in computer science and AI but also in physics and biology.
- Neural Computing — Our brains are an exquisite instance of natural computing. Redesigning the computers that power AI so they work more like a brain will greatly increase AI’s energy efficiency—and its capabilities too.
- Predictive Intelligence — The success of large language models (LLMs) shows us something fundamental about the nature of intelligence: it involves statistical modeling of the future (including one’s own future actions) given evolving knowledge, observations and feedback from the past. This insight suggests that current distinctions between designing, training and running AI models are transitory; more sophisticated AI will evolve, grow and learn continuously and interactively, as we do.
- General Intelligence — Intelligence does not necessarily require biologically based computation. Although AI models will continue to improve, they are already broadly capable, tackling an increasing range of cognitive tasks with a skill level approaching and, in some cases, exceeding individual human capability. In this sense, “Artificial General Intelligence” (AGI) may already be here—we just keep shifting the goalposts.
- Collective Intelligence — Brains, AI agents and societies can all become more capable through increased scale. However, size alone is not enough. Intelligence is fundamentally social, powered by cooperation and the division of labor among many agents. In addition to causing us to rethink the nature of human (or “more than human”) intelligence, this insight suggests social aggregations of intelligences and multi-agent approaches to AI development that could reduce computational costs, increase AI heterogeneity and reframe AI safety debates.
Ecologies of Minds looks at the distinction between evolutionary and optimizing minds.
What should we call this? Factome, conceptome, empitome, noosphere…?
Ian Morris on whether deep history suggests we’re heading for an intelligence explosion
Do we need computers to create AIs at all, or are we all already AIs?
Publications - Center for the Study of Apparent Selves (CSAS)