Empowerment
A particular mathematization of intrinsic motivation
2022-11-27 — 2026-04-09
Wherein Empowerment Is Formalised as the Mutual Information Between Actions and Future States, and Its Bearing Upon Power-Seeking in Intelligent Agents Is Considered.
The drive to empowerment is a hypothesized, generic, internal pressure on an agent to move itself into states from which it has lots of influence (or many options) going forward. If an agent “seeks empowerment”, it ‘aims’ to maximize its ability to affect the future. We can put this concept to work in the age of AI agents, where it helps us ask questions like: is “power-seeking” a generic property of intelligent agents? It also looks a little like a formalisation, for robots and abstract algorithms, of the notion of having fun, or of some other intrinsic motivation.
As with many such intuitions, it is easier to state than to usefully formalize.
Technical empowerment is tightly defined: it’s a precise, information-theoretic quantity from optimal control theory, specifically the mutual information between actions and future states. Borrowing the POMDP notation — sans-serif for random variables, \(\mathsf{a}_t \in \mathcal{A}\) for actions and \(\mathsf{s}_t \in \mathcal{S}\) for states — the \(n\)-step empowerment at state \(s\) is the channel capacity from an action sequence \(\mathsf{a}_t^n = (\mathsf{a}_t, \mathsf{a}_{t+1}, \ldots, \mathsf{a}_{t+n-1})\) to the future state \(\mathsf{s}_{t+n}\):
\[ \mathfrak{E}_n(s) = \max_{p(\mathsf{a}_t^n)} I\!\left(\mathsf{a}_t^n;\, \mathsf{s}_{t+n} \,\middle|\, \mathsf{s}_t = s\right), \tag{1}\]
where the maximum runs over distributions on action sequences and the mutual information is taken under the environment dynamics \(T(s' \mid s, a)\). In rough terms it quantifies how many distinguishable futures the agent could reliably reach, given its action choices and the dynamics of the environment.
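To make (1) concrete, here is a minimal sketch that computes it exactly for a toy problem, using the Blahut-Arimoto iteration for channel capacity. Everything about the environment is an assumption made for illustration: a hypothetical five-cell corridor, three actions, and an optional `slip` probability that a move fails. The function names are likewise invented here. Action sequences are treated as open-loop, matching the definition above.

```python
import math
from itertools import product

N_CELLS = 5             # hypothetical corridor: cells 0..4
ACTIONS = (-1, 0, +1)   # left, stay, right

def transition(s, a, slip=0.0):
    """T(s' | s, a) as a dict. With probability `slip` the move fails and the agent stays put."""
    target = min(max(s + a, 0), N_CELLS - 1)  # walls clamp the move
    dist = {s: slip}
    dist[target] = dist.get(target, 0.0) + (1.0 - slip)
    return dist

def n_step_channel(s0, n, slip=0.0):
    """One row per open-loop action sequence: the distribution of the state n steps after s0."""
    rows = []
    for seq in product(ACTIONS, repeat=n):
        dist = {s0: 1.0}
        for a in seq:
            nxt = {}
            for s, ps in dist.items():
                for s2, pt in transition(s, a, slip).items():
                    nxt[s2] = nxt.get(s2, 0.0) + ps * pt
            dist = nxt
        rows.append([dist.get(s, 0.0) for s in range(N_CELLS)])
    return rows

def capacity_bits(channel, iters=200):
    """Blahut-Arimoto iteration; returns I(A; S') in bits at the final input distribution."""
    n_a, n_s = len(channel), len(channel[0])
    p = [1.0 / n_a] * n_a  # start from uniform over action sequences

    def marginal(p):
        return [sum(p[a] * channel[a][s] for a in range(n_a)) for s in range(n_s)]

    for _ in range(iters):
        q = marginal(p)
        # d[a] = exp(KL(p(. | a) || q)): inputs with more distinctive outcomes get upweighted
        d = [math.exp(sum(r * math.log(r / q[s]) for s, r in enumerate(row) if r > 0))
             for row in channel]
        z = sum(pa * da for pa, da in zip(p, d))
        p = [pa * da / z for pa, da in zip(p, d)]
    q = marginal(p)
    return sum(p[a] * r * math.log2(r / q[s])
               for a, row in enumerate(channel) for s, r in enumerate(row)
               if r > 0 and p[a] > 0)

def empowerment(s0, n=1, slip=0.0):
    """n-step empowerment of state s0, in bits."""
    return capacity_bits(n_step_channel(s0, n, slip))
```

With deterministic moves, `empowerment(2)` is \(\log_2 3 \approx 1.585\) bits (three distinguishable one-step outcomes from the middle cell), while `empowerment(0)` is exactly 1 bit, because at the wall "left" and "stay" collapse into the same outcome. Setting `slip=0.2` lowers the value, since noisy outcomes are harder to tell apart. Enumerating all action sequences scales exponentially in \(n\), so this is strictly a toy.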
Empowerment is sometimes called a “pseudo-utility” (i.e. a kind of internal reward proxy) that depends only on local, agent‐accessible information. If we squint at it, we can imagine that it quantifies how much the agent can “control” its future, in a way that’s independent of any particular extrinsic reward or task. Of course, in making that claim we assume a lot of structure: a well-defined agent, environment, state space, action space, transition dynamics, etc., wrapped up in a POMDP or similar formalism.
It looks like empowerment in this special formalisation captures a general tendency of agents to keep options open and extend their influence. The argument (e.g. in Empowerment Is (Almost) All We Need) is that a sufficiently powerful drive for empowerment might lead to many of the behaviours we desire in intelligent agents: exploration, maintaining options, robustness, der Wille zur Macht, etc.
In each case, the concept is not about a fixed external goal or reward (like “get the treasure”). Rather, the agent is driven by how much control it has over its future, regardless of the extrinsic objective.
1 Technical empowerment in reinforcement learning
In RL, agents typically optimize for maximal yield from an external reward signal. Many environments have sparse, noisy, or delayed reward signals, which makes learning hard (the exploration problem, credit assignment, etc.).
Empowerment sidesteps some of these pathologies, and several lines of work build on it:
- Because it doesn’t depend on an external task, an agent can explore more “safely” or systematically by trying to increase its control.
- It biases the agent towards states with many outgoing branches — places from which many futures are reachable. In many domains, that corresponds to being in central, flexible positions rather than being stuck in a dead end.
- Some works combine extrinsic reward with empowerment. For example, there is a formulation of a unified Bellman equation that mixes reward maximization with empowerment terms (Leibfried, Pascual-Diaz, and Grau-Moya 2020).
- Others use empowerment or information‐theoretic objectives as intrinsic motivations to guide exploration, especially in sparse‐reward tasks (Dai et al. 2021).
- More recently, works have integrated causal modelling with empowerment to get better sample efficiency and more directed exploration. For instance, “Empowerment via Causal Learning” is a framework in model‐based RL that uses causal structure to compute empowerment more meaningfully (Cao, Feng, Fang, et al. 2025).
- There is also a notion called causal action empowerment, which aims to focus the empowerment signal on those actions that causally influence important parts of the environment (Cao, Feng, Huo, et al. 2025). I really need to read that one.
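The "many outgoing branches" intuition in the list above is easiest to see when the dynamics are deterministic: the channel from actions to outcomes is then noiseless, so its capacity is just the log of the number of distinct reachable states. A minimal sketch, assuming a hypothetical 5×5 gridworld with clamping walls (the grid, the move set, and all function names are invented here for illustration):

```python
import math

SIZE = 5  # hypothetical 5x5 grid
MOVES = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))  # stay, plus the four compass moves

def step(cell, move):
    """Deterministic move; walls clamp the agent in place along the blocked axis."""
    x, y = cell
    dx, dy = move
    return (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))

def reachable(cell, n):
    """Set of cells the agent can occupy after exactly n moves."""
    frontier = {cell}
    for _ in range(n):
        frontier = {step(c, m) for c in frontier for m in MOVES}
    return frontier

def deterministic_empowerment(cell, n):
    """log2 of the number of distinct n-step outcomes (valid for deterministic dynamics only)."""
    return math.log2(len(reachable(cell, n)))

def greedy_empowerment_move(cell, n=2):
    """Pick the move whose successor cell has the highest n-step empowerment."""
    return max(MOVES, key=lambda m: deterministic_empowerment(step(cell, m), n))
```

With a two-step horizon the centre cell `(2, 2)` reaches 13 cells and the corner `(0, 0)` only 6, so the centre is worth \(\log_2 13 \approx 3.70\) bits against \(\log_2 6 \approx 2.58\). A greedy empowerment-follower starting in the corner therefore steps away from it. Once the dynamics are noisy this counting shortcut no longer applies, and we are back to estimating the mutual information in (1).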
This all sounds cool, but clearly most people do not use empowerment as their RL objective. AFAICS this is because it creates more problems than it solves.
It is easy to say that empowerment is “easier” than long-horizon reward computation. But computing empowerment is in general expensive. Estimating mutual information is generally hard, for one thing, and it gets worse in high-dimensional, continuous, or partially observable settings. Further, getting information about future states requires modelling the environment dynamics. So it’s the usual RL problems, but with a spiky, tricky estimand. There are various approximation methods (Zhao et al. 2020).
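One common starting point for such approximations is a variational lower bound on the mutual information in (1). As a sketch: the symbols \(\omega\) (a source distribution over action sequences) and \(q\) (a learned inverse model that guesses the action sequence from the start and end states) are notation introduced here for illustration, and are not taken from the cited work.

\[ I\!\left(\mathsf{a}_t^n;\, \mathsf{s}_{t+n} \,\middle|\, \mathsf{s}_t = s\right) \;\ge\; \mathbb{E}_{a^n \sim \omega(\cdot \mid s),\; s' \sim p(\cdot \mid s, a^n)}\!\left[\log q(a^n \mid s, s') - \log \omega(a^n \mid s)\right] \]

The gap is the expected KL divergence from the true posterior \(p(a^n \mid s, s')\) to \(q\), so the bound is tight when the inverse model is exact. Maximizing the right-hand side jointly over \(\omega\) and \(q\) gives a lower bound on \(\mathfrak{E}_n(s)\) that can be estimated from samples, but it still requires sampling the dynamics or a learned model of them.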
Moreover, I know of no optimality results for empowerment, and it looks like it is not terribly efficient.
I also have questions about how it is used in practice. Is it purely a training objective? What is the off-policy/on-policy story? How far do we roll out? Do we estimate an empowerment value function, etc.?
2 Empowerment in replicators
Let us keep the technical definition of empowerment in mind, but also consider the more metaphorical idea of an agent trying to keep many options open and maintain control over its future.
2.1 Replicators, vehicles, and influence
In evolutionary biology, a replicator (in the Dawkins/Hull sense) is an entity that is replicated with variation and subject to selection pressures (Godfrey-Smith 2000). A gene, informally, “wants” to maximize its propagation potential. It evolves strategies (via the organism) to influence its environment (via phenotype, behaviour, niche construction, etc.). Of course, genes cannot literally compute technical empowerment. But might what they optimize for be something like it?
- A replicator benefits from having many possible viable futures — i.e. flexibility in ecological or developmental trajectories such that it can survive under various conditions.
- The vehicle/organism is the means by which the replicator acts on the environment to preserve or replicate itself.
Just as an AI agent might try to keep many future branches open, a replicator — through its phenotypic machinery — might favour designs that maintain options in changing environments. That might look like having “many instances”, i.e. a large future population, although even as I think that through I am unconvinced about this metaphor. What roll-outs can a gene do?
Anyway, arguments suggesting replicators might seek such goals are awkwardly called selection theorems.
2.2 Empowerment-like structure in evolution
Here are some speculative bridges to empowerment:
- Robustness & evolvability: replication systems that can tolerate perturbations (robustness) and adapt (evolvability) are more “powerful” in the face of environmental change. That’s a kind of biological counterpart to having many controllable futures; compare the robustness-and-evolvability idea from evolutionary theory (Wagner 2005).
- Niche construction / environment modification: many organisms modify the environment (e.g. beaver dams, root systems altering soil, microbial communities altering chemistry). These are ways replicators/vehicles shape the environment to increase the control space or favourable pathways they can exploit.
- Redundancy, backup paths, modularity: mechanisms like gene duplication, redundant pathways, or modular designs allow alternative routes of adaptation or survival when parts fail. That’s akin to an agent having multiple “paths” to future states.
- Selection in fluctuating environments: when environments change, replicators that maintain flexible strategies (i.e. not overly specialized) may outperform those with narrow optima. That aligns with the push toward “keeping many options open.”
All these ideas seem not crazy per se, but I am not convinced, insofar as it is unclear to me how genes could “know” how many options they have kept open. Some kind of shadow of the future seems to be necessary for selection to favour empowerment-like traits, but it’s not clear how that would work mechanistically.
3 AI Agents and power-seeking
Empowerment suggests a route toward open-endedness: systems whose internal drive is to expand their “influence frontier” could drive themselves toward complexity and diversity, or ability to affect the world in more and more ways.
In the AI safety literature, this latter tendency is called power-seeking. The concern is that, almost regardless of their outer goals, capable agents will instrumentally pursue strategies that give them more control over their environment, preserve their own functioning, and prevent others from shutting them down. For multi-task agents, having more influence over the world is generally useful for achieving a wide range of goals. The term of art for this is instrumental convergence (Omohundro 2018): many different goals, once pursued by sufficiently capable agents, lead to the same kinds of instrumental strategies, such as acquiring resources, defending against threats, preserving optionality, and extending influence.
As with the evolutionary version, we face the problem that this all happens in an open-ended world. How do we formalize and measure empowerment in such a setting? Technical empowerment gives us some footholds, but once we are in multi-agent, evolving, or unbounded environments, it becomes much harder to define what “future influence” even means, let alone hope it cashes out in a nice equation that we can compute.
Classic reading on this theme:
- jacob_cannell, Empowerment is (almost) All We Need, on A. S. Klyubin, Polani, and Nehaniv (2005)
- Joe Carlsmith, When should we worry about AI power-seeking?
4 Incoming
Salge, Glackin, and Polani (2014)
Combine with extrinsic tasks
- See how empowerment helps in sparse reward environments or as an exploration bonus.
- Read “A Unified Bellman Principle Combining Reward Maximization and Empowerment” for one approach. (Leibfried, Pascual-Diaz, and Grau-Moya 2020)
Scaling & approximation
- Formally, look into techniques for approximating empowerment in high-dimensional or continuous spaces (variational approximations, estimating mutual information, etc.) (Zhao et al. 2020).
Causal / model-based enhancements
- In the technical sense, explore recent work that adds causal modelling to compute empowerment more meaningfully (i.e. what actions truly affect what variables) (Cao, Feng, Fang, et al. 2025).
Connections to open-endedness, evolution, artificial life
- Metaphorically, study how intrinsic drives (like empowerment) can support open-ended growth or autonomous innovation.
- Read in artificial life / evolutionary robotics about self-replication, niche construction, and the pressures toward controllability and adaptability (Taylor and Dorin 2020).
Critiques and safety considerations
- In the agent foundations sense, investigate failure modes: e.g. empowerment-driven agents might prefer “safe control” regions over risky but useful ones.
- Analyse whether empowerment aligns with human values or task goals, and whether it can be perverted.
