The deep history of intelligence

Alignment at the omega point

2025-06-04 — 2025-08-30

adversarial
AI safety
catastrophe
economics
faster pussycat
innovation
language
machine learning
mind
neural nets
NLP
security
technology
Figure 1

Let’s reason backwards from the final destination of civilisation, if such a thing there be. What intelligences persist at the omega point? With what is superintelligence aligned in the big picture?

Various authors have tried to put modern AI developments in continuity with historical trends: from less materially-sophisticated societies, through more legible, compute-oriented societies, to some attractor, or set of attractors, at the end of history. Computational superorganisms. Singularities. Some authors even start before humans, and roll it back to microbes, or even pre-biotic systems.

All such models, the ones that speculate about big-picture trends in information processing, I’m considering as models of intelligence in big history.

cf Empowerment.

1 From Cells to Superorganisms

OK, if we want to make that work, we need to decide: what even is intelligence? When we discuss intelligence we usually imagine a clever individual—a human, an octopus, perhaps an AI. But what if we can formally identify intelligence as some property of superorganisms, or of generic organized systems, one that has been evolving and scaling for billions of years?

Across fields like history, evolution, and systems ecology, you find visionaries or cranks (cranksionaries!) who suggest that the story of our universe is one of growing computational power, where “intelligence” is the capacity of a system to harness energy to process information and shape its environment. From such a perspective, a bacterial colony, a forest ecosystem, and a modern megacity might all be points on a continuum of intelligence. They are all systems that create and maintain order against the relentless tide of entropy. And that’s smart, y’know, dude.

I’m trying to work out whether any of these ideas are useful to me, in the sense of generating testable hypotheses, or of narrowing the probability space of evolving civilisations, etc.

2 The Thermodynamic Engine

At the most basic level, we can suppose intelligence runs on energy. Eric Chaisson’s work in cosmic evolution provides a putatively universal metric: energy rate density (\(\phi m\)), or the flow of energy per second per gram (Chaisson 2011). When you plot this value for various systems, a trend appears. Stars have a low \(\phi m\). Plants have a higher one. Animals, higher still. A modern human society, with its immense metabolism of fossil fuels and electricity, has an energy rate density orders of magnitude greater than any biological system that came before it. In their model, this is the thermodynamic signature of “complexity”. To build and maintain intricate structures, a system must channel vast amounts of energy through its mass.
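As a sanity check on the units, here is a minimal sketch; the power and mass figures are round numbers I supply for illustration, not Chaisson’s measured values:

```python
# Energy rate density phi_m, in Chaisson's units of erg per second per gram.
# The power/mass figures below are rough round numbers for illustration only.
ERG_PER_JOULE = 1e7  # 1 W = 1e7 erg/s

def phi_m(power_watts: float, mass_grams: float) -> float:
    """Energy rate density in erg/s/g."""
    return power_watts * ERG_PER_JOULE / mass_grams

# The Sun: luminosity ~3.8e26 W spread over a mass of ~2e33 g.
sun = phi_m(3.8e26, 2e33)
# A human body: ~100 W of metabolic power over ~7e4 g.
human = phi_m(100.0, 7e4)

print(f"Sun:   {sun:8.1f} erg/s/g")   # ~2: matches Chaisson's figure for stars
print(f"Human: {human:8.0f} erg/s/g") # ~1e4: orders of magnitude above a star
```

Despite the Sun’s colossal total output, a warm mammal channels vastly more energy per gram, which is the sense in which the metric tracks complexity rather than sheer size.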

This perspective resembles Hagens’ conception of the global economy as a mindless, energy-hungry “Superorganism” (Hagens 2020). Our civilization’s structure is entirely dependent on a massive throughput of high-quality energy, which allows for the intricate division of labor and information technology that define our modern intelligence. The downside, Hagens argues, is that this superorganism is “growth constrained,” behaviourally locked into myopically maximizing its energy consumption.

3 The Physics of Prediction

But the link between energy and intelligence is deeper than just fuel. A fundamental principle of thermodynamics dictates that to be energy-efficient, a system must be predictive. Work by Susanne Still and colleagues establishes a formal equivalence between thermodynamic waste (dissipated heat) and informational waste (Still et al. 2012). The information a system remembers about its past can be split into two kinds: a useful part that helps predict the future, and a useless part they call “nostalgia”. Their conclusion is that this useless, non-predictive information is directly proportional to the energy the system must dissipate as heat.

In this model, wasted energy is the physical cost of holding onto memories that don’t help you anticipate what’s next. This is a thermodynamic imperative: any system that evolves toward maximal energy efficiency is, by physical law, evolving to become a better prediction machine. This sounds like one of the ideas that connects to the thermodynamics of life and/or the statistical mechanics of statistics.
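A toy version of this bookkeeping (my own toy setup, not one from the paper): the environment is a binary signal that repeats its previous value with probability 0.9, and the system’s memory is a perfect copy of the current input. The stored bit splits into a predictive part and pure nostalgia:

```python
import math

def h2(p: float) -> float:
    """Binary entropy in bits."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

stay = 0.9  # probability the driving signal repeats its last value

# The memory stores the current input exactly: one full bit (the stationary
# distribution of this symmetric chain is uniform, so H(x) = 1 bit).
captured = 1.0
# Mutual information between the stored bit and the *next* input:
# I(x_t; x_{t+1}) = H(x_{t+1}) - H(x_{t+1} | x_t) = 1 - h2(stay).
predictive = 1.0 - h2(stay)
# Still et al.'s "nostalgia": information retained about the past that is
# useless for prediction, and hence (they argue) must be paid for in heat.
nostalgia = captured - predictive

print(f"captured   = {captured:.3f} bits")
print(f"predictive = {predictive:.3f} bits")
print(f"nostalgia  = {nostalgia:.3f} bits")
```

Even with a highly persistent signal, nearly half the stored bit is nostalgia; a more energy-efficient memory would compress away exactly that part.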

4 A Physics of Adaptation

The principles above suggest that efficient systems must be predictive. Can we roll that concept back a little further in history, and think about the little engines of prediction from a physical rather than a biological perspective? How did matter become predictive, or start managing data, in the first place? The most direct and physically-grounded attempt to answer this, AFAICT, comes from the work of Jeremy England and colleagues. In England (2013) he argued that self-replication itself could be a thermodynamically favored outcome, a highly effective way for matter to dissipate energy. His colleagues have generalized this to “dissipation-driven adaptation” (Perunov, Marsland, and England 2016), which I have tried to understand over at the thermodynamics of life notebook.

They make a formal argument derived from the principles of far-from-equilibrium statistical mechanics. Take any system of particles (like a soup of chemicals). Drive it with an external energy source (like sunlight or chemical fuel). Allow it to dissipate that energy as heat into a surrounding bath (like the ocean). England’s theoretical results suggest that over time, such a system will tend to spontaneously self-organize into structures that are exceptionally good at absorbing and dissipating work from the environment.

The idea is that certain life-like properties are extraordinarily good at dissipating energy. The main example is self-replication. A single, complex molecule can only absorb and dissipate so much energy. But a population of a billion such molecules, created through replication, can dissipate vastly more. (I confess my grasp of this part of the argument is extremely shaky; I have a statistician’s understanding of entropy, not a physicist’s.) England’s framework suggests that replication might be a thermodynamically favored outcome—a particularly effective solution that matter “discovers” for dissipating energy. In this view, “adaptation” is a physical process that can occur even before Darwinian selection kicks in. The system’s structure physically adapts to the patterns of the energy driving it, becoming “well-adapted” in a purely physical sense.

The Perunov, Marsland, and England (2016) argument can connect us to something like intelligence because it connects us to computation. The proposed link lies in the idea that the system’s physical structure becomes an implicit memory of the environmental forces that shaped it, something memory something something Turing complete.

A system that has self-organized to be good at absorbing a specific type of energy (e.g., light of a certain frequency, or mechanical shaking in a particular pattern) has physically changed its shape. That new shape is a durable record of the environmental conditions that drove its formation, in the way that the slope and orientation of a sand dune records the wind that has blown upon it. Now, if the system can get better at configuring itself for what will happen next, perhaps it will get better at absorbing the next energetic input and dissipating it, and therefore predominate. If we squint at that, it looks like a motivation for sprouting a brain.

5 I predict therefore I am

If the laws of thermodynamics demand that efficient systems be predictive, is there a principle that explains why predictive systems exist at all? This is the territory of Karl Friston’s Free Energy Principle (FEP), a particularly ambitious and notoriously difficult-to-parse attempted unified theory of intelligence and maybe life (Friston 2013).

The principle posits that any system that persists in a fluctuating world—from a single cell to a brain—must maintain a boundary between itself and its environment. They call this the Markov blanket, which I believe means the same here as it does in probabilistic graphical models. To maintain this boundary and not dissolve back into environmental chaos, the system must act to keep its internal states within a narrow, predictable range. Further, it must minimize “surprise”—or more technically, an upper bound on surprise called “variational free energy”, which I think is meant as the same sense as in variational inference.
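If the analogy with variational inference holds, the quantity being minimised is the familiar bound (standard ELBO algebra, in my notation):

\[
F[q] = \mathbb{E}_{q(s)}\bigl[\ln q(s) - \ln p(o, s)\bigr]
= D_{\mathrm{KL}}\bigl(q(s) \,\|\, p(s \mid o)\bigr) - \ln p(o)
\;\ge\; -\ln p(o),
\]

where \(o\) are sensory observations and \(s\) the hidden environmental states. Minimising \(F\) simultaneously tightens an upper bound on surprise \(-\ln p(o)\) and pulls the internal model \(q(s)\) towards the Bayesian posterior \(p(s \mid o)\).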

On one hand, the system acts on the world to make its sensory inputs match its expectations (this is autopoiesis, or self-creation). On the other hand, its internal states become a probabilistic model of the environment, constantly updating to make better predictions in a process of active inference.

An intelligent agent, in the Fristonian sense, is something like a system that embodies a predictive model of its environment and acts to make its own predictions come true.

Danger: the ‘free energy’ in the FEP is an information-theoretic quantity, distinct from the thermodynamic free energy of physicists, which causes endless confusion. The debate continues over whether the FEP provides a new fundamental law of nature or a very powerful, but ultimately circular, ‘as-if’ description of any system that manages to not fall apart.

6 Greetings, Fellow Calculating Machines

Now we can see how these pieces fit together. A system uses energy to run computations that implement an information life cycle, and the pressure for energy efficiency forces those computations to become predictive.

David Wolpert and Kyle Harper formalize this at the societal scale with their “Multiple Communicating Machines” (MCM) model (David H. Wolpert and Harper 2025). They treat a society as a computer where occupations and technologies are interacting computational units. A society’s “computational power” is its ability to process information about its environment to effectively extract energy. This creates the central feedback loop of deep history: harvesting more energy allows a society to support more complex computation (more specialized jobs), which in turn allows it to harvest energy more effectively. This is the superorganism’s version of active inference: using its collective intelligence to model its world and act upon that model to ensure its continued existence.

So in this view, the agricultural revolution was a new set of algorithms for manipulating ecosystems. The rise of the state was a new computational architecture for managing large populations. And as Blaise Agüera y Arcas and James Manyika’s article argues, AI represents a new substrate for computation that is fundamentally changing our world.

7 Rise of the Superorganism

Many folks, myself included, think it’s reasonable to examine societies as “crude superorganisms” (Boyd and Richerson 1999). As human societies became larger and more interconnected, they began to function as integrated, goal-directed entities.

Michael Wong and Stuart Bartlett argue that civilizations built on this kind of superlinear scaling—where growth and energy demand accelerate—are on a trajectory towards “asymptotic burnout” (Wong and Bartlett 2022). The time between crises that require major innovation gets shorter until the rate of innovation can no longer keep up, leading to collapse. This is their solution to the Fermi Paradox.

Figure 2

They propose two outs: wise civilizations might undergo a “homeostatic awakening”, in which they use their advanced intelligence to recognize this destructive trajectory and consciously shift their priorities from unbounded growth to long-term stability. Or they might roll the dice and become a Type III, galaxy-spanning expansionist civilisation; but we suspect this rarely succeeds, or we would see more of them.

The growth-based singularity is also amusingly similar to an idea of my Master’s supervisor, who found a finite-time singularity by curve-fitting economic data (Johansen and Sornette 2001). YMMV.
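The mechanism is easy to reproduce with the textbook superlinear-growth ODE (my own illustration, not the fitted Johansen-Sornette model): \(\dot{x} = k x^{\beta}\) with \(\beta > 1\) diverges at the finite time \(t^{*} = x_0^{1-\beta} / (k(\beta - 1))\), unlike exponential growth, which takes forever to reach infinity.

```python
# Superlinear growth dx/dt = k * x**beta with beta > 1 hits infinity in
# finite time, unlike exponential growth (beta = 1). Toy illustration only.

def blowup_time(x0: float, k: float, beta: float) -> float:
    """Time t* at which the closed-form solution diverges."""
    assert beta > 1.0, "finite-time blowup needs superlinear growth"
    return x0 ** (1.0 - beta) / (k * (beta - 1.0))

def x_at(t: float, x0: float, k: float, beta: float) -> float:
    """Closed-form trajectory; only valid for t < blowup_time(x0, k, beta)."""
    return (x0 ** (1.0 - beta) - k * (beta - 1.0) * t) ** (1.0 / (1.0 - beta))

# With x0 = k = 1 and beta = 2 the solution is x(t) = 1 / (1 - t):
t_star = blowup_time(1.0, 1.0, 2.0)
print(t_star)                    # blowup at t* = 1
print(x_at(0.9, 1.0, 1.0, 2.0))  # ~10: growth crowds together near t*
```

The “time between crises gets shorter” picture is just this trajectory read backwards: each doubling arrives faster than the last, so something must break before \(t^{*}\).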

8 The Big Table

To clarify the differences, here is a feature matrix (disclaimer: generated by AI):

| Model / Framework | What is “Intelligence”? | Engine of Growth/Evolution | Formalism/Methodology | Scale/Domain |
|---|---|---|---|---|
| Still et al. (Thermo. of Prediction) | Predictive power; minimizing non-predictive “nostalgic” information | Thermodynamic pressure for energetic efficiency | Statistical Physics (Nonequilibrium Thermodynamics) | Universal (any system with memory) |
| Friston’s Free Energy Principle | Active inference; minimizing variational free energy; embodying a predictive model | An existential imperative to resist dissipation and minimize surprise | Bayesian Statistics, Dynamical Systems Theory | Universal (purportedly from cells to societies) |
| England’s Dissipation-Driven Adaptation | “Physical adaptation”; structures exceptionally good at absorbing and dissipating energy from their environment | Thermodynamic pressure to self-organize into states that increase entropy production; the imperative to dissipate energy is what creates adaptive, life-like structures | Non-Equilibrium Statistical Mechanics (Fluctuation Theorems) | |
| Chaisson’s Energy-Rate Density | Energy flow per unit mass (\(\phi m\)) as a measure of complexity | Increasing capacity to capture and channel energy through a system | Empirical Metric (erg/s/g) | Universal (cosmos to society) |
| Shin et al. (Seshat) | Information-processing capacity (e.g., writing, bureaucracy, currency) | Phased growth: scaling in size creates pressure for informational innovations | Statistical Analysis (PCA) of historical database | Holocene societies |
| Wolpert & Harper’s MCM | “Computational power”: the ability to process information to achieve goals | Co-evolutionary feedback loop between energy harvesting and computational capacity | Formal Computational Model (new class of automata) | Universal (cells to societies) |
| Hagens’ Superorganism | The emergent metabolic activity of the global economic system | Maximization of financial surplus, tethered to energy and carbon flows | Systems View (conceptual framework with empirical data) | Global human civilization |
| Wong & Bartlett’s Burnout | A civilization’s trajectory of accelerating growth (superlinear scaling) | Positive feedback loops between population, innovation, and energy demand | Dynamical Systems Theory (conceptual, based on scaling laws) | Planetary civilizations |
| Boyd & Richerson’s Superorganism | Cooperative functioning of large-scale societies | Gene-culture co-evolution creating social instincts, managed by cultural “work-arounds” | Evolutionary Theory (conceptual hypothesis) | Human societies |

9 Understanding by building

Can we design open-ended intelligence?

Jeff Clune has posed the question of “could we devise an open-ended exploratory algorithm that is worth running for a billion years?”

Is that what life is building?

(Clune 2020; Cully et al. 2015; Ecoffet et al. 2021; Faldor et al. 2024; Wang et al. 2019, 2020)

TBC

10 Biological anchors stuff

Ajeya Cotra tries to anchor modern synthetic intelligence by considering what nature has done, historically, in terms of compute.

11 Incoming

  • Deacon (2012):

    Incomplete Nature begins by accepting what other theories try to deny: that, although mental contents do indeed lack these material-energetic properties, they are still entirely products of physical processes and have an unprecedented kind of causal power that is unlike anything that physics and chemistry alone have so far explained. Paradoxically, it is the intrinsic incompleteness of these semiotic and teleological phenomena that is the source of their unique form of physical influence in the world. Incomplete Nature meticulously traces the emergence of this special causal capacity from simple thermodynamics to self-organizing dynamics to living and mental dynamics, and it demonstrates how specific absences (or constraints) play the critical causal role in the organization of physical processes that generate these properties.

  • PIBBSS – Principles of Intelligent Behavior in Biological and Social Systems

  • Prediction: Life will turn out to be everywhere (after a certain point)

  • The Second Law of Thermodynamics, and Engines of Cognition

  • Paths To High-Level Machine Intelligence

  • Blaise Agüera y Arcas and James Manyika: AI Is Evolving — And Changing Our Understanding Of Intelligence

    In this essay, we will describe five interrelated paradigm shifts informing our development of AI:

    1. Natural Computing — Computing existed in nature long before we built the first “artificial computers”. Understanding computing as a natural phenomenon will enable fundamental advances not only in computer science and AI but also in physics and biology.
    2. Neural Computing — Our brains are an exquisite instance of natural computing. Redesigning the computers that power AI so they work more like a brain will greatly increase AI’s energy efficiency—and its capabilities too.
    3. Predictive Intelligence — The success of large language models (LLMs) shows us something fundamental about the nature of intelligence: it involves statistical modeling of the future (including one’s own future actions) given evolving knowledge, observations and feedback from the past. This insight suggests that current distinctions between designing, training and running AI models are transitory; more sophisticated AI will evolve, grow and learn continuously and interactively, as we do.
    4. General Intelligence — Intelligence does not necessarily require biologically based computation. Although AI models will continue to improve, they are already broadly capable, tackling an increasing range of cognitive tasks with a skill level approaching and, in some cases, exceeding individual human capability. In this sense, “Artificial General Intelligence” (AGI) may already be here—we just keep shifting the goalposts.
    5. Collective Intelligence — Brains, AI agents and societies can all become more capable through increased scale. However, size alone is not enough. Intelligence is fundamentally social, powered by cooperation and the division of labor among many agents. In addition to causing us to rethink the nature of human (or “more than human”) intelligence, this insight suggests social aggregations of intelligences and multi-agent approaches to AI development that could reduce computational costs, increase AI heterogeneity and reframe AI safety debates.
  • Joscha Bach on Synthetic Intelligence / EA forum annotation

  • ecologies of minds considers the distinction between evolutionary and optimising minds.

  • What are we calling this? Factome, conceptome, empitome, noosphere…?

  • Ian Morris on whether deep history says we’re heading for an intelligence explosion

  • Deep atheism and AI risk - Joe Carlsmith

  • Do we need computers to create AIs at all, or are we all already AIs?

  • Paradigms of Intelligence Team

  • Human values & biases are inaccessible to the genome

  • Empowerment is (almost) All We Need

  • Publications - Center for the Study of Apparent Selves (CSAS)

12 References

Abramsky, Banzhaf, Caves, et al. 2025. “Open Questions about Time and Self-Reference in Living Systems.”
Aktipis. 2016. “Principles of Cooperation Across Systems: From Human Sharing to Multicellularity and Cancer.” Evolutionary Applications.
Arcas, Alakuijala, Evans, et al. 2024. “Computational Life: How Well-Formed, Self-Replicating Programs Emerge from Simple Interaction.”
Axelrod, Robert M. 1984. The Evolution of Cooperation.
Axelrod, Robert, and Hamilton. 1981. “The Evolution of Cooperation.” Science, New Series.
Beaulieu, Frati, Miconi, et al. 2020. “Learning to Continually Learn.”
Bergemann, and Morris. 2005. “Robust Mechanism Design.” Econometrica.
Best, and Kellner. 1999. “Kevin Kelly’s Complexity Theory: The Politics and Ideology of Self-Organizing Systems.” Organization & Environment.
Bowles, Choi, and Hopfensitz. 2003. “The Co-Evolution of Individual Behaviors and Social Institutions.” Journal of Theoretical Biology.
Boyd, and Richerson. 1999. “Complex Societies: The Evolutionary Origins of a Crude Superorganism.” Human Nature.
Boyd, and Richerson. 2005. The Origin and Evolution of Cultures. Evolution and Cognition.
Chaisson. 2011. “Energy Rate Density as a Complexity Metric and Evolutionary Driver.” Complexity.
Chapman, Childers, and Vallino. 2016. “How the Second Law of Thermodynamics Has Informed Ecosystem Ecology Through Its History.” BioScience.
Clune. 2020. “AI-GAs: AI-Generating Algorithms, an Alternate Paradigm for Producing General Artificial Intelligence.”
Cully, Clune, Tarapore, et al. 2015. “Robots That Can Adapt Like Animals.” Nature.
Dafoe, Hughes, Bachrach, et al. 2020. “Open Problems in Cooperative AI.”
Deacon. 2012. Incomplete Nature: How Mind Emerged from Matter.
Ecoffet, Huizinga, Lehman, et al. 2021. “First Return, Then Explore.” Nature.
Egel. 2012. “Life’s Order, Complexity, Organization, and Its Thermodynamic–Holistic Imperatives.” Life: Open Access Journal.
England. 2013. “Statistical Physics of Self-Replication.” The Journal of Chemical Physics.
Faldor, Zhang, Cully, et al. 2024. “OMNI-EPIC: Open-Endedness via Models of Human Notions of Interestingness with Environments Programmed in Code.” In.
Fletcher, and Zwick. 2007. “The Evolution of Altruism: Game Theory in Multilevel Selection and Inclusive Fitness.” Journal of Theoretical Biology.
Friston. 2013. “Life as We Know It.” Journal of The Royal Society Interface.
Galesic, Barkoczi, Berdahl, et al. 2022. “Beyond Collective Intelligence: Collective Adaptation.”
Hagens. 2020. “Economics for the Future – Beyond the Superorganism.” Ecological Economics.
Harari. 2018. Homo Deus: A Brief History of Tomorrow.
Hetzer, and Sornette. 2009. “Other-Regarding Preferences and Altruistic Punishment: A Darwinian Perspective.” SSRN Scholarly Paper ID 1468517.
Hoffman, and Prakash. 2014. “Objects of Consciousness.” Frontiers in Psychology.
Holmström. 1979. “Moral Hazard and Observability.” The Bell Journal of Economics.
Johansen, and Sornette. 2001. “Finite-Time Singularity in the Dynamics of the World Population, Economic and Financial Indices.” Physica A: Statistical Mechanics and Its Applications.
Kauffman, Stuart A. 1993. The Origins of Order: Self-Organization and Selection in Evolution.
Kauffman, Stuart A. 1996. At Home in the Universe: The Search for the Laws of Self-Organization and Complexity.
Laffont, and Martimort. 2002. The Theory of Incentives: The Principal-Agent Model.
Lang, Fisher, Mora, et al. 2014. “Thermodynamics of Statistical Inference by Cells.” Physical Review Letters.
Levin. 2024. “Artificial Intelligences: A Bridge Toward Diverse Intelligence and Humanity’s Future.” Advanced Intelligent Systems.
Marsland, and England. 2018. “Limits of Predictions in Thermodynamic Systems: A Review.” Reports on Progress in Physics.
Mas-Colell, Whinston, and Green. 1995. Microeconomic Theory.
Maskin. 1999. “Nash Equilibrium and Welfare Optimality.” The Review of Economic Studies.
Mercier, and Sperber. 2011. “Why Do Humans Reason? Arguments for an Argumentative Theory.” Behavioral and Brain Sciences.
———. 2017. The Enigma of Reason.
Mesoudi, and Whiten. 2008. “The Multiple Roles of Cultural Transmission Experiments in Understanding Human Cultural Evolution.” Philosophical Transactions of the Royal Society B: Biological Sciences.
Meulemans, Kobayashi, Oswald, et al. 2024. “Multi-Agent Cooperation Through Learning-Aware Policy Gradients.” In.
Morris. 2014. The Measure of Civilization: How Social Development Decides the Fate of Nations.
Muthukrishna. 2023. A Theory of Everyone: The New Science of Who We Are, How We Got Here, and Where We’re Going.
Myerson. 1981. “Optimal Auction Design.” Mathematics of Operations Research.
Nowak. 2006. “Five Rules for the Evolution of Cooperation.” Science.
Omohundro. 2008. “The Basic AI Drives.” In Artificial General Intelligence 2008: Proceedings of the First AGI Conference.
Perunov, Marsland, and England. 2016. “Statistical Physics of Adaptation.” Physical Review X.
Prakash, Stephens, Hoffman, et al. 2021. “Fitness Beats Truth in the Evolution of Perception.” Acta Biotheoretica.
Ringstrom. 2022. “Reward Is Not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning.”
Schneider, and Kay. 1994. “Life as a Manifestation of the Second Law of Thermodynamics.” Mathematical and Computer Modelling.
Shin, Price, Wolpert, et al. 2020. “Scale and Information-Processing Thresholds in Holocene Social Evolution.” Nature Communications.
Sornette. 2003. “Critical Market Crashes.” Physics Reports.
Still, Sivak, Bell, et al. 2012. “Thermodynamics of Prediction.” Physical Review Letters.
Suki. 2012. “The Major Transitions of Life from a Network Perspective.” Frontiers in Physiology.
Thagard. 1997. “Collaborative Knowledge.” Noûs.
Wang, Lehman, Clune, et al. 2019. “POET: Open-Ended Coevolution of Environments and Their Optimized Solutions.” In Proceedings of the Genetic and Evolutionary Computation Conference. GECCO ’19.
Wang, Lehman, Rawal, et al. 2020. “Enhanced POET: Open-Ended Reinforcement Learning Through Unbounded Invention of Learning Challenges and Their Solutions.”
Wolpert, David H. 2006. “Information Theory — The Bridge Connecting Bounded Rational Game Theory and Statistical Physics.” In Complex Engineered Systems. Understanding Complex Systems.
Wolpert, David H. 2008. “Physical Limits of Inference.” Physica D: Nonlinear Phenomena, Novel Computing Paradigms: Quo Vadis?
Wolpert, David. 2017. “Constraints on Physical Reality Arising from a Formalization of Knowledge.”
Wolpert, David H. 2018. “Theories of Knowledge and Theories of Everything.” In The Map and the Territory: Exploring the Foundations of Science, Thought and Reality.
———. 2019. “Stochastic Thermodynamics of Computation.”
Wolpert, David H., Bieniawski, and Rajnarayan. 2011. “Probability Collectives in Optimization.”
Wolpert, David H., and Harper. 2025. “The Computational Power of a Human Society: A New Model of Social Evolution.”
Wolpert, David H., and Tumer. 1999. “An Introduction to Collective Intelligence.” arXiv:cs/9908014.
Wong, and Bartlett. 2022. “Asymptotic Burnout and Homeostatic Awakening: A Possible Solution to the Fermi Paradox?” Journal of The Royal Society Interface.