The deep history of intelligence
Alignment at the omega point
2025-06-04 — 2025-09-23
Wherein the lineage of intelligence is traced through thermodynamics and prediction, and Chaisson’s energy‑rate‑density metric is invoked to link cells, cities and superorganisms.
Let’s reason backwards from the final destination of civilisation, if such a thing there be. What intelligences persist at the omega point? With what is superintelligence aligned in the big picture?
Various authors have tried to put modern AI developments in continuity with historical trends from less materially sophisticated societies, through more legible, compute-oriented societies, to some set of attractors at the end of history. Computational superorganisms. Singularities. Some authors even start before humans and roll it back to microbes or prebiotic systems.
I consider all such models of big-picture trends in information processing to be models of intelligence in big history.
cf Empowerment, fitness and loss functions, …
1 From Cells to Superorganisms
OK, if we want this to work, we need to decide: what even is intelligence? When we discuss intelligence, we usually imagine a clever individual: a human, an octopus, perhaps an AI. But what if we can formally identify intelligence as some property of superorganisms, or of generic organized systems, one that has been evolving and scaling for billions of years?
Across fields like history, evolution, and systems ecology, we find visionaries or cranks (cranksionaries!) who suggest that the story of our universe is one of growing computational power, where “intelligence” is the capacity of a system to harness energy to process information and shape its environment. From such a perspective, a bacterial colony, a forest ecosystem, and a modern megacity might all be points on a continuum of intelligence. They are all systems that create and maintain order against the relentless tide of entropy. And that’s smart, y’know, dude.
I’m trying to work out if any of those ideas are useful to me in the sense of generating testable hypotheses, or useful for narrowing the probability space of evolving civilisation, etc.
2 The Thermodynamic Engine
At the most basic level, we can suppose intelligence runs on energy. Eric Chaisson’s work in cosmic evolution provides a putatively universal metric: energy rate density (\(\phi m\)), the flow of energy per second per gram (Chaisson 2011). When you plot this value for various systems, a trend appears. Stars have a low \(\phi m\). Plants have a higher one. Animals, higher still. A modern human society, with its immense metabolism of fossil fuels and electricity, has an energy rate density orders of magnitude greater than any biological system that came before it. In Chaisson’s model, this is the thermodynamic signature of “complexity”. To build and maintain intricate structures, a system must channel vast amounts of energy through its mass.
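To make the units concrete, here is a back-of-envelope sketch. The inputs are my own round numbers, not Chaisson’s published figures, and the society bookkeeping (world primary energy divided by human biomass) is just one of several defensible choices:

```python
# Back-of-envelope energy rate densities, in Chaisson's units of erg/s/g.
# All inputs are rough order-of-magnitude guesses, not Chaisson's values.

def phi_m(power_watts: float, mass_kg: float) -> float:
    """Energy rate density: power per unit mass, converted to erg/s/g."""
    ERG_PER_JOULE = 1e7
    GRAMS_PER_KG = 1e3
    return power_watts * ERG_PER_JOULE / (mass_kg * GRAMS_PER_KG)

systems = {
    "Sun":     (3.8e26, 2.0e30),   # luminosity ~3.8e26 W, mass ~2e30 kg
    "human":   (1.0e2, 7.0e1),     # ~100 W basal metabolism, ~70 kg
    # Industrial society: ~18 TW of primary energy over the biomass of
    # ~8e9 people; Chaisson's own denominator differs in detail.
    "society": (1.8e13, 8e9 * 70),
}

for name, (watts, kg) in systems.items():
    print(f"{name:8s} phi_m ~ {phi_m(watts, kg):12.1f} erg/s/g")
```

The Sun lands near 2 erg/s/g and industrial society near \(10^5\), which reproduces the orders-of-magnitude spread described above.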
This perspective resembles Hagens’ conception of the global economy as a mindless, energy-hungry “Superorganism” (Hagens 2020). Our civilisation’s structure is entirely dependent on a massive throughput of high-quality energy, which allows for the intricate division of labour and information technology that define our modern intelligence. The downside, Hagens argues, is that this superorganism is “growth constrained,” behaviourally locked into myopically maximizing its energy consumption.
3 The Physics of Prediction
But the link between energy and intelligence is deeper than mere fuel. A fundamental principle of thermodynamics dictates that to be energy-efficient, a system “must” be predictive. Work by Susanne Still and colleagues proposes a formal equivalence between thermodynamic waste (dissipated heat) and informational waste (Still et al. 2012) (cf. Selection theorems). The information a system remembers about its past can be split into two kinds: a useful part that helps predict the future, and a useless part the authors call “nostalgia”. Their conclusion is that this useless, non-predictive information is directly proportional to the energy the system must dissipate as heat.
In this model, wasted energy is the physical cost of holding onto memories that don’t help a system anticipate what’s next. This is a thermodynamic imperative: any system that evolves toward maximal energy efficiency is, by physical law, evolving to become a better prediction machine. This seems connected to the thermodynamics of life and the statistical mechanics of statistics.
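In symbols, as I transcribe the result (see the paper for the exact setting): let \(s_t\) be the system’s state and \(x_t\) the driving signal, and define

\[
I_{\text{mem}} = I(s_t; x_t), \qquad I_{\text{pred}} = I(s_t; x_{t+1}).
\]

Their key relation ties the work dissipated over a driving step to the “nostalgia” \(I_{\text{mem}} - I_{\text{pred}}\),

\[
\beta\, \langle W_{\text{diss}}[x_t \to x_{t+1}] \rangle = I_{\text{mem}} - I_{\text{pred}},
\]

which is non-negative when the drive is Markov, since \(s_t\) can only know about \(x_{t+1}\) through \(x_t\).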
4 A Physics of Adaptation
The principles above suggest that efficient systems must be predictive. Can we roll that concept back a little further into history and think about these little engines of prediction from a physical perspective rather than a biological one? How did matter become predictive, or start managing data in the first place? The most direct and physically grounded attempt to answer this, AFAICT, comes from the work of Jeremy England and colleagues. In England (2013), he argued that self-replication could be a thermodynamically favoured outcome, a highly effective way for matter to dissipate energy. His colleagues have generalized this to “dissipation-driven adaptation” (Perunov, Marsland, and England 2016), which I’ve tried to understand over at the thermodynamics of life notebook.
They make a formal argument derived from the principles of far-from-equilibrium statistical mechanics. Take any system of particles (like a soup of chemicals). Drive it with an external energy source (like sunlight or chemical fuel). Allow it to dissipate that energy as heat into a surrounding bath (like the ocean). England’s theoretical results suggest that over time, such a system will tend to spontaneously self-organize into structures that are exceptionally good at absorbing and dissipating work from the environment.
The idea is that certain life-like properties are extraordinarily good at dissipating energy. The main example is self-replication. A single, complex molecule can only absorb and dissipate so much energy. But a population of a billion such molecules, created through replication, can dissipate vastly more. (I confess my grasp of this part of the argument is extremely shaky; I have a statistician’s understanding of entropy, not a physicist’s.) England’s framework suggests that replication might be a thermodynamically favoured outcome—a particularly effective solution that matter “discovers” for dissipating energy. In this view, “adaptation” is a physical process that can occur even before Darwinian selection kicks in. The system’s structure physically adapts to the patterns of the energy driving it, becoming “well-adapted” in a purely physical sense.
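For what it’s worth, the headline inequality of England (2013), as I transcribe it, bounds a replicator’s growth in terms of its entropy production:

\[
\beta \langle \Delta q \rangle + \Delta s_{\text{int}} \;\geq\; \ln\frac{g}{\delta},
\]

where \(g\) is the replication rate, \(\delta\) the decay rate, \(\langle \Delta q \rangle\) the average heat released into the bath per replication event, and \(\Delta s_{\text{int}}\) the internal entropy change. Replicators that are both fast (high \(g\)) and durable (low \(\delta\)) must pay for it in dissipation.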
The Perunov, Marsland, and England (2016) argument can connect us to something like intelligence because it connects physical adaptation to computation. The proposed link lies in the idea that the system’s physical structure becomes an implicit memory of the environmental forces that shaped it, something memory something something Turing complete. [TODO clarify]
A system that has self-organized to be good at absorbing a specific type of energy (e.g., light of a certain frequency, or mechanical shaking in a particular pattern) has physically changed its shape. That new shape is a durable record of the environmental conditions that drove its formation, in the way that the shape and orientation of a sand dune record the wind that has blown upon it. Now, if it can learn to predict and get better at configuring itself for what will happen next, perhaps it will get better at absorbing the next energetic input and dissipating it, and therefore predominate. If we squint at that, it looks like a motivation for sprouting a brain.
5 I predict therefore I am
If the laws of thermodynamics demand that efficient systems be predictive, is there a principle that explains why predictive systems exist at all? This is the territory of Karl Friston’s Free Energy Principle (FEP), a particularly ambitious—and notoriously difficult-to-parse—attempted unified theory of intelligence and perhaps life (Friston 2013).
The principle posits that any system that persists in a fluctuating world—from a single cell to a brain—must maintain a boundary between itself and its environment. Friston and colleagues call this the Markov blanket, which I believe means the same here as it does in probabilistic graphical models. To maintain this boundary and not dissolve back into environmental chaos, the system must act to keep its internal states within a narrow, predictable range. Further, it must minimize “surprise”—or more technically, an upper bound on surprise called “variational free energy”, which I think is meant in the same sense as in variational inference.
On one hand, the system acts on the world to make its sensory inputs match its expectations (this is autopoiesis, or self-creation). On the other hand, its internal states become a probabilistic model of the environment, constantly updating to make better predictions in a process of active inference.
An intelligent agent, in the Fristonian sense, is something like a system that embodies a predictive model of its environment and acts to make its own predictions come true.
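For the record, the “upper bound on surprise” is the standard variational identity. With \(o\) the observations (sensory states), \(s\) the hidden causes, \(p\) the generative model, and \(q\) the approximate posterior,

\[
F(o, q) = \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right] = D_{\mathrm{KL}}\!\left(q(s)\,\middle\|\,p(s \mid o)\right) - \ln p(o) \;\geq\; -\ln p(o).
\]

Minimizing \(F\) over \(q\) is perception (pulling \(q\) toward the true posterior); minimizing it over actions that change \(o\) is, on this account, behaviour.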
Danger: the “free energy” in the FEP is an information-theoretic quantity, distinct from the thermodynamic free energy of physicists, which causes endless confusion. The debate continues over whether the FEP provides a new fundamental law of nature or a very powerful, but ultimately circular, “as-if” description of any system that manages to not fall apart.
6 Greetings, Fellow Calculating Machines
Now we can see how these pieces fit together. A system uses energy to run computations that implement an information life cycle, and the pressure for energy efficiency forces those computations to become predictive.
David Wolpert and Kyle Harper formalize this at the societal scale with their “Multiple Communicating Machines” (MCM) model (David H. Wolpert and Harper 2025). They treat a society as a computer where occupations and technologies are interacting computational units. A society’s “computational power” is its ability to process information about its environment to effectively extract energy. This creates the central feedback loop of deep history: harvesting more energy allows a society to support more complex computation (more specialized jobs), which in turn allows it to harvest energy more effectively. This is the superorganism’s version of active inference: using its collective intelligence to model its world and act upon that model to ensure its continued existence.
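I find the loop easier to see as a toy dynamical system. To be clear, this is my own cartoon, not the MCM formalism; the functional forms and parameters below are invented:

```python
# A cartoon of the energy/computation feedback loop, NOT Wolpert &
# Harper's actual MCM model. Energy harvested per step scales with
# computational capacity, and capacity grows with whatever surplus
# energy can be reinvested in specialization.

alpha = 0.05    # computation -> energy-harvesting efficiency
beta = 0.02     # surplus energy -> new computational capacity
upkeep = 0.01   # running cost per unit of complexity

E, C, t = 1.0, 1.0, 0
while E < 1e6 and t < 10_000:
    harvest = alpha * C * E           # better models extract more energy
    surplus = harvest - upkeep * C    # complexity is costly to maintain
    E += harvest
    C += beta * max(surplus, 0.0)     # surplus funds new "occupations"
    t += 1

print(f"energy crossed 1e6 at step {t} with computation C = {C:.1f}")
```

With these made-up parameters, growth crawls for a long while and then the loop runs away; the point is only the qualitative shape of the feedback, not any quantitative claim.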
So in this view, the agricultural revolution was a new set of algorithms for manipulating ecosystems. The rise of the state was a new computational architecture for managing large populations. And as Blaise Agüera y Arcas and James Manyika argue, AI represents a new substrate for computation that is fundamentally changing our world. Their phase-based model is a bit less elegant (Arcas et al. 2024), but it might be “true” in some sense, so I’ll slot it in here.
7 Rise of the Superorganism
Many of us, myself included, think it’s reasonable to examine societies as “crude superorganisms” (Boyd and Richerson 1999). As societies grew and became more interconnected, they began to function like integrated, goal-directed entities.
Michael Wong and Stuart Bartlett argue that civilizations following this kind of superlinear scaling—where growth and energy demand accelerate—are on a trajectory towards “asymptotic burnout” (Wong and Bartlett 2022). The interval between crises that require major innovation shortens until the pace of innovation can’t keep up, leading to collapse. That’s their solution to the Fermi Paradox.
They propose two outs: wise civilizations might undergo a “homeostatic awakening”, in which they use their advanced intelligence to recognize this destructive trajectory and consciously shift their priorities from unbounded growth to long-term stability. Or they might roll the dice and try to become a Type III, galaxy-spanning expansionist civilization, though presumably that rarely succeeds; otherwise we’d see more of them.
The growth-based singularity is amusingly similar to an idea proposed by my Master’s supervisor, who found a similar result by curve-fitting economic data (Johansen and Sornette 2001). [TODO clarify] YMMV.
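Both the burnout story and the Sornette-style curve fits lean on the same elementary mathematics: superlinear growth implies a finite-time singularity. If a quantity \(N\) (population, energy use, economic output) obeys

\[
\frac{dN}{dt} = r N^{\beta}, \quad \beta > 1,
\qquad\text{then}\qquad
N(t) = N_0\left(1 - \frac{t}{t_c}\right)^{-1/(\beta-1)},
\quad t_c = \frac{N_0^{\,1-\beta}}{r(\beta-1)},
\]

so \(N\) diverges at the finite horizon \(t_c\). Something (an innovation reset, a homeostatic awakening, a collapse) has to change the dynamics before then.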
8 The Big Table
To clarify the differences, here’s the feature matrix (disclaimer: generated by AI):
Model / Framework | What is “Intelligence”? | Engine of Growth/Evolution | Formalism/Methodology | Scale/Domain |
---|---|---|---|---|
Still et al. (Thermo. of Prediction) | Predictive power; minimizing non-predictive “nostalgic” information. | Thermodynamic pressure for energetic efficiency. | Statistical Physics (Nonequilibrium Thermodynamics) | Universal (Any system with memory) |
Friston’s Free Energy Principle | Active Inference; minimizing variational free energy; embodying a predictive model. | An existential imperative to resist dissipation and minimize surprise. | Bayesian Statistics, Dynamical Systems Theory. | Universal (purportedly from cells to societies) |
England’s Dissipation-Driven Adaptation | “Physical Adaptation”; structures exceptionally good at absorbing and dissipating energy from their environment. | Thermodynamic pressure for systems to self-organize into states that increase entropy production. | Non-Equilibrium Statistical Mechanics (Fluctuation Theorems). | Universal (driven matter, from molecules up) |
Chaisson’s Energy-Rate Density | Energy flow per unit mass (\(\phi m\)) as a measure of complexity. | Increasing capacity to capture and channel energy through a system. | Empirical Metric (erg/s/g) | Universal (Cosmos to Society) |
Shin et al. (Seshat) | Information-processing capacity (e.g., writing, bureaucracy, currency). | Phased growth: scaling in size creates pressure for informational innovations. | Statistical Analysis (PCA) of historical database. | Holocene Societies |
Wolpert & Harper’s MCM | “Computational power”: the ability to process information to achieve goals. | Co-evolutionary feedback loop between energy harvesting and computational capacity. | Formal Computational Model (new class of automata). | Universal (Cells to Societies) |
Hagens’ Superorganism | The emergent metabolic activity of the global economic system. | Maximization of financial surplus, tethered to energy and carbon flows. | Systems View (conceptual framework with empirical data). | Global Human Civilization |
Wong & Bartlett’s Burnout | A civilization’s trajectory of accelerating growth (superlinear scaling). | Positive feedback loops between population, innovation, and energy demand. | Dynamical Systems Theory (conceptual, based on scaling laws). | Planetary Civilizations |
Boyd & Richerson’s Superorganism | Cooperative functioning of large-scale societies. | Gene-culture co-evolution creating social instincts, managed by cultural “work-arounds”. | Evolutionary Theory (conceptual hypothesis). | Human Societies |
9 Understanding by building
Can we design open-ended intelligence?
10 Biological anchors
Ajeya Cotra’s “biological anchors” report attempts to anchor forecasts for modern synthetic intelligence by comparing the compute used to train it against the compute that nature historically used.
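A minimal sketch of the arithmetic, assuming my order-of-magnitude recollections of the report’s reference points are roughly right (the report itself works with wide distributions):

```python
# Order-of-magnitude "biological anchors" arithmetic, from memory of
# Cotra's report; every figure is uncertain by several orders of magnitude.
SECONDS_PER_YEAR = 3.15e7
brain_flops = 1e15  # common rough estimate of human-brain FLOP/s

# Lifetime anchor: compute consumed by one brain over ~30 years of learning.
lifetime_anchor = brain_flops * 30 * SECONDS_PER_YEAR
print(f"lifetime anchor  ~ {lifetime_anchor:.0e} FLOP")   # ~1e24

# Evolution anchor: all neural computation over ~1 Gyr of animal evolution.
# The population-average rate is an invented stand-in chosen to land near
# the report's headline ~1e41 FLOP.
population_flops = 3e24  # mean FLOP/s across every nervous system alive
evolution_anchor = population_flops * 1e9 * SECONDS_PER_YEAR
print(f"evolution anchor ~ {evolution_anchor:.0e} FLOP")  # ~1e41
```

The report’s argument is that plausible training-compute requirements for transformative AI land somewhere in the vast interval between these anchors.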
11 Incoming
Deacon (2012): [TODO clarify]
Incomplete Nature begins by accepting what other theories try to deny: that, although mental contents do indeed lack these material-energetic properties, they are still entirely products of physical processes and have an unprecedented kind of causal power that is unlike anything that physics and chemistry alone have so far explained. Paradoxically, it is the intrinsic incompleteness of these semiotic and teleological phenomena that is the source of their unique form of physical influence in the world. Incomplete Nature meticulously traces the emergence of this special causal capacity from simple thermodynamics to self-organizing dynamics to living and mental dynamics, and it demonstrates how specific absences (or constraints) play the critical causal role in the organization of physical processes that generate these properties.
PIBBSS – Principles of Intelligent Behavior in Biological and Social Systems
Prediction: Life will turn out to be everywhere (after a certain point)
Blaise Agüera y Arcas and James Manyika: AI Is Evolving — And Changing Our Understanding Of Intelligence
In this essay, we will describe five interrelated paradigm shifts informing our development of AI:
- Natural Computing — Computing existed in nature long before we built the first “artificial computers”. Understanding computing as a natural phenomenon will enable fundamental advances not only in computer science and AI but also in physics and biology.
- Neural Computing — Our brains are an exquisite instance of natural computing. Redesigning the computers that power AI so they work more like a brain will greatly increase AI’s energy efficiency—and its capabilities too.
- Predictive Intelligence — The success of large language models (LLMs) shows us something fundamental about the nature of intelligence: it involves statistical modeling of the future (including one’s own future actions) given evolving knowledge, observations and feedback from the past. This insight suggests that current distinctions between designing, training and running AI models are transitory; more sophisticated AI will evolve, grow and learn continuously and interactively, as we do.
- General Intelligence — Intelligence does not necessarily require biologically based computation. Although AI models will continue to improve, they are already broadly capable, tackling an increasing range of cognitive tasks with a skill level approaching and, in some cases, exceeding individual human capability. In this sense, “Artificial General Intelligence” (AGI) may already be here—we just keep shifting the goalposts.
- Collective Intelligence — Brains, AI agents and societies can all become more capable through increased scale. However, size alone is not enough. Intelligence is fundamentally social, powered by cooperation and the division of labor among many agents. In addition to causing us to rethink the nature of human (or “more than human”) intelligence, this insight suggests social aggregations of intelligences and multi-agent approaches to AI development that could reduce computational costs, increase AI heterogeneity and reframe AI safety debates.
Ecologies of Minds looks at the distinction between evolutionary and optimizing minds.
What should we call this? Factome, conceptome, empitome, noosphere…?
Ian Morris on whether deep history suggests we’re heading for an intelligence explosion
Do we need computers to create AIs at all, or are we all already AIs?
Publications - Center for the Study of Apparent Selves (CSAS)