Agency under bounded compute and information

Stuck between the good regulator and the big world

2026-04-29

adaptive
agents
AI safety
bounded compute
compsci
economics
extended self
machine learning
mind
statmech
when to compute

Here’s where I wish to start, for context: If we have complete information about the initial conditions of the universe, and unlimited compute, then it is not natural to classify anything as possessing “agency”; we can see the whole of the universe, up to quantum fluctuations, by stepping our perfect simulator forward in time. There are no decisions to be made, merely consequences to be observed.

But we might also find that thinking of ourselves as horologists to the clockwork universe is unsatisfactory as a model for us doing the modelling. I will never be this omniscient being. Knowing all of this universe would need me to have another, much larger, universe to store the information in, to build the computer to do the simulation, and so on. I mean maybe that happened and I am the simulation of some omniscient agent? But that omniscient agent is not a great model for me.

AIXI relaxes the first assumption (we don’t know where the universe started, we just arrived in it with an effectively infinite computer and need to work out what it is). That still doesn’t feel like enough of a relaxation. More generally, “agency” seems like it might accrue to entities which need to learn to act with both bounded information and bounded compute. Let us explore some ideas.

Figure 1: The two axes are information about the world (complete vs bounded) and compute available (unlimited vs bounded).

Note: Adjacent run-ups

One of several notebooks I started on the same underlying problem. The others are homunculi (compute split across self, other, and reflective sub-models) and economics of cognition (compute as a substitutable factor of production). That I keep failing to merge them is, I think, telling me something about the shape of the question.

1 The good regulator in the big world

The internal model principle, AKA the good regulator theorem, says: any agent that does well in an environment must contain a model of that environment.1 So the agent contains a model.

The big-world hypothesis says: the environment is strictly bigger than the agent — there is no fixed point at which the agent has “solved” the world. (A formal version is the Lewandowski et al. setup below — an agent embedded as a finite automaton inside a universal computer, interacting with a POMDP over a countably infinite state space.)

Together, these suggest that we need to think about lossy compression of the world, by construction. What kind of compression is the agent’s internal model? Which features survive? What is it sufficient for? When does the residual look like noise, and when does it look like the world ambushing us?

Somewhere in here there is a bridge between internal models and bounded rationality. The internal model principle says we need a model. Bounded rationality says we cannot afford the naïve one, the territory-as-map. The technical question is what we can afford that still works.

2 Prior art

Short notes on each. Future-me, do not be alarmed by the count — these are mostly variations on a single optimisation problem, viewed from different traditions.

| Entry | Operates on | What it does | Where the bound enters |
|---|---|---|---|
| Resource-rational analysis | cognitive strategies | picks the optimum | soft cost on cognitive operations |
| Bounded optimality | programs for architecture \(M\) | picks the optimum | hard feasibility |
| Rational metareasoning | computations, online | picks the next one to run | per-step VOC cost |
| Information-theoretic bounded rationality | policies | picks the KL-constrained optimum | KL budget vs prior |
| Bounded inductive rationality | logical beliefs over time | characterizes calibrated paths | no-arbitrage in the limit |
| Computationally tractable choice | decision rules | rules out intractable ones | polynomial-time axiom |
| POMDP-as-agent | system interpretations | characterizes POMDP-shaped ones | state-space capacity |
| Computationally-embedded big world agent | automata in computable worlds | picks interactivity-maintaining ones | agent fits inside the world |

NARS is somehow philosophically nearby but is not formalised tightly enough to fit a row.

2.1 Resource-rational analysis

Lieder & Griffiths (Lieder and Griffiths 2020) frame human cognition as the optimal use of finite computational resources. The procedure: define the agent’s task, the cognitive operations available, and their costs; derive the heuristic that maximises expected utility under the budget. The heuristic that comes out is the prediction; observed cognitive biases are the right answer to the constrained problem.

LLM summary follows:

The optimisation has the same shape as the bounded-rationality entries above but at a different grain:

\[ \pi^* = \arg\max_{\pi \,\in\, \text{strategies}} \mathbb{E}\bigl[\,U(\pi) - \mathrm{cost}(\pi)\,\bigr], \]

where \(\pi\) is a cognitive strategy — an algorithm parameterised by how much compute it spends — and \(\mathrm{cost}(\pi)\) is a cost on cognitive operations (search expansions, posterior samples, working-memory loads). This sits between BO and RM in granularity: BO optimises whole programs offline, RM optimises individual computations online, and RRA optimises the heuristic itself — a strategy with an internal compute knob — treating it as a unit.
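A throwaway sketch of the shape of that optimisation, to keep the abstraction honest. The estimation task, the cost constant, and the utility function below are my own illustrative choices, not anything from the paper; the strategy space is just the compute knob \(k\).

```python
import numpy as np

def expected_utility(k, p=0.3, cost_per_sample=1e-3):
    """Utility of the strategy 'estimate p from k Bernoulli samples'.

    Expected squared error of the sample mean is p(1-p)/k; negative error
    is treated as utility, and each sample costs a fixed amount.
    (Toy task and cost model, illustrative assumptions only.)
    """
    expected_sq_error = p * (1 - p) / k
    return -expected_sq_error - cost_per_sample * k

# Resource-rational analysis in miniature: search over strategies
# (here just the compute knob k) and keep the best affordable one.
ks = np.arange(1, 200)
utilities = np.array([expected_utility(k) for k in ks])
k_star = ks[np.argmax(utilities)]
print(f"resource-rational sample count: k* = {k_star}")
# Analytically k* is about sqrt(p(1-p)/cost); "too few samples" biases
# fall out whenever cost_per_sample is non-negligible.
```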

The interpretive move that makes RRA distinctive in the cognitive-science direction is reading observed biases as \(\pi^*\) for some plausible cost model, not as failures of rationality. Anchoring-style adjustment and availability-based estimation have been derived in this framework as resource-rational under specific costs on sample count and inference depth. The inferential arrow runs in the unusual direction: from observed heuristic, infer the brain’s implicit cost structure.

What it gives us: an explicit machinery for “best affordable model”. If the world is bigger than the agent, this is the framework that tells us which compression to keep.

2.2 Bounded optimality

Russell & Subramanian (S. J. Russell and Subramanian 1995) define a program as bounded-optimal for a given architecture if no other program for that architecture does better in the environment. The optimisation moves from “best action” to “best algorithm-on-the-hardware”. Resource-rational analysis is its descendant in the cognitive-science direction; Gershman, Horvitz & Tenenbaum’s computational rationality (Gershman, Horvitz, and Tenenbaum 2015) is the umbrella name.

LLM summary follows:

Fix an architecture \(M\) — a machine model with bounded memory and per-step compute, plus a clock. Bounded optimality is the program \(\pi^*\) that maximises expected utility subject to running on \(M\):

\[ \pi^* = \arg\max_{\pi \,\in\, \mathrm{programs}(M)} \mathbb{E}\bigl[U \mid \pi,\,\text{environment}\bigr]. \]

So the agent is no longer choosing actions; it is choosing the algorithm that will, in turn, choose actions on its hardware. This is a meta-level optimization over policies-as-programs, with the architecture’s resource constraints baked into the feasible set rather than tacked on as a soft penalty — which is what distinguishes BO from the KL-regularized, free-energy framing above.
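A toy rendering of that feasible-set move. The environment, the memory bound, and the lookup-table program class below are invented for illustration; the point is only that the budget lives in the definition of \(\mathrm{programs}(M)\), not in the objective.

```python
import itertools, random

# Toy environment (invented): the observation is a 3-bit context;
# reward is 1 if the action matches the majority bit of the full context.
def step(context, action):
    majority = 1 if sum(context) >= 2 else 0
    return 1.0 if action == majority else 0.0

def evaluate(program, memory_bits, n=5000, seed=0):
    """Average reward of a lookup-table program that can only read the
    first `memory_bits` bits of the observation: the hard feasibility
    constraint imposed by the architecture M."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        context = tuple(rng.randint(0, 1) for _ in range(3))
        visible = context[:memory_bits]
        total += step(context, program[visible])
    return total / n

MEMORY_BITS = 2  # the architecture M: can index only 2 bits of observation
# programs(M): every lookup table from visible observations to actions.
keys = list(itertools.product([0, 1], repeat=MEMORY_BITS))
programs = [dict(zip(keys, actions))
            for actions in itertools.product([0, 1], repeat=len(keys))]

best = max(programs, key=lambda prog: evaluate(prog, MEMORY_BITS))
print("bounded-optimal program:", best,
      "value:", round(evaluate(best, MEMORY_BITS), 3))
```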

Russell & Subramanian also propose asymptotic bounded optimality (ABO), a weakening that requires the program to approach optimality only as the architecture is scaled up, because strict per-architecture optimality is itself intractable to analyse. Cf. large-sample theory: infinite asymptotics are more tractable to analyse than any finite case. ABO is a softer cousin of AIXI’s compute-to-infinity move — when “best on \(M\)” is too hard to characterise, the analysis retreats to “best as \(M\) grows”. Sticking to bounded compute seems to be really hard, judging by what smart people do when they try to stick to it.

What it gives us: an insistence that the budget is a property of the implementation, not a vague hand-wave. Future me, if I am tempted to talk about “the agent’s budget” without specifying the architecture, see above.

2.3 Rational metareasoning

S. Russell and Wefald (1991) model each unit of computation as a decision: expected value of running the computation minus its cost.2 The agent decides which inferences to run as a planning problem one level up.

LLM summary follows:

Two questions distinguish RM from the BO entry above (also a Russell programme): what is being optimised and when. BO optimises programs, offline — a designer picks the best program for a fixed architecture, once, and the program then just runs. RM optimises computation choices, online — at runtime the agent itself decides, step by step, which inference to run next. The two framings are complementary rather than competing: RM is the technique a bounded-optimal program might use internally to allocate its compute, and BO is the criterion under which the program-plus-RM-policy bundle as a whole should be assessed.

The decision-theoretic content is the value of computation (VOC). Let \(b\) be the agent’s current belief state and \(c\) a candidate computation — expanding a node in a search tree, sampling another rollout, refining a posterior. Then

\[ \mathrm{VOC}(c \mid b) = \mathbb{E}\bigl[\,U \,\big|\, \text{act after } c,\, b\,\bigr] - \mathbb{E}\bigl[\,U \,\big|\, \text{act now},\, b\,\bigr] - \mathrm{cost}(c). \]

Run \(c\) if its VOC is positive; act otherwise. The Hay-Russell-Tolpin-Shimony meta-MDP turns this into a proper Markov decision process whose states are beliefs over the underlying MDP and whose actions include both external actions and computation actions, with optimal policies given by Bellman equations on the meta level.

Notice the regress: estimating \(\mathrm{VOC}(c)\) before running \(c\) is itself a computation, and deciding whether that is worth running is another. Practical RM truncates the recursion — usually at one level (myopic VOC, “is this single next step worth it?”), occasionally deeper. The truncation is what lets the framework run; whether the truncated VOC is itself bounded-optimal in any tight sense is the question that makes RM and BO need each other.
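A sketch of the myopic truncation in about the simplest setting I can think of: Beta-Bernoulli arms, where the candidate “computation” is drawing one more sample before committing. The numbers and the cost are placeholders.

```python
def myopic_voc(alpha, beta, other_mean, cost):
    """Value of one more Bernoulli observation of an arm with Beta(alpha, beta)
    posterior, when the best alternative has known mean `other_mean`.

    Act-now value: max(current mean, other_mean).
    Act-after value: expectation over the next observation of the new max.
    (Myopic, one-step lookahead only: the truncation described above.)
    """
    mean = alpha / (alpha + beta)
    act_now = max(mean, other_mean)
    # Posterior predictive of the next observation:
    p_success = mean
    mean_if_success = (alpha + 1) / (alpha + beta + 1)
    mean_if_failure = alpha / (alpha + beta + 1)
    act_after = (p_success * max(mean_if_success, other_mean)
                 + (1 - p_success) * max(mean_if_failure, other_mean))
    return act_after - act_now - cost

# Uncertain arm Beta(2, 2) vs a safe option paying 0.5 for sure:
voc = myopic_voc(alpha=2, beta=2, other_mean=0.5, cost=0.01)
print(f"VOC = {voc:+.4f} -> {'compute (sample the arm)' if voc > 0 else 'act now'}")
```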

What it gives us: a decision-theoretically explicit story about when to compute, which is the question posed by the good-regulator-meets-big-world tension. The agent cannot run the full simulation; the metareasoner picks the slice.

2.4 Information-theoretic bounded rationality

Ortega & Braun (Pedro Alejandro Ortega and Braun 2011; Pedro A. Ortega and Braun 2013) formalise a bounded-rational policy as the KL-divergence-constrained optimum of expected utility.

LLM summary follows:

Pick a reference or prior policy \(\pi_0\) — what the agent does cheaply, without deliberating — and constrain the deliberated policy \(\pi\) to stay close to it in KL. So the budget is on the KL divergence, not on FLOPs, and the optimisation is

\[ \pi^* = \arg\max_\pi \;\mathbb{E}_\pi[U(a)] \quad\text{s.t.}\quad \mathrm{KL}(\pi \,\|\, \pi_0) \le B, \]

equivalently, with \(\beta\) the Lagrange multiplier on the constraint,

\[ \pi^* = \arg\max_\pi \;\mathbb{E}_\pi[U(a)] - \tfrac{1}{\beta}\,\mathrm{KL}(\pi \,\|\, \pi_0). \]

The thermodynamic flavour comes from reading \(\mathrm{KL}(\pi \,\|\, \pi_0)\) as the work done against the prior: every nat of divergence from \(\pi_0\) has to be extracted from somewhere, by the same Landauer-style accounting that prices erasure. That is the argument that this functional counts as a model of compute cost and not just a regularizer.

The functional that drops out is a free energy, \(F[\pi] = \mathbb{E}_\pi[U] - \tfrac{1}{\beta}\,\mathrm{KL}(\pi \,\|\, \pi_0)\), whose maximizer is the Boltzmann-tilted policy \(\pi^*(a) \propto \pi_0(a)\,\exp(\beta\,U(a))\). The convenient consequence is that we can sample from \(\pi^*\) by importance-weighting draws from \(\pi_0\) against \(\exp(\beta U)\) — no need to enumerate actions to find the argmax, because the optimum is a tilting of \(\pi_0\) that a Monte Carlo planner can estimate consistently from samples of the prior.
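A sketch of that sampling claim, with a placeholder action set, prior, and \(\beta\); nothing below is from Ortega & Braun beyond the tilted-policy formula itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder action set, prior, and utilities (illustrative only).
actions = np.array([0, 1, 2, 3])
pi0 = np.array([0.4, 0.3, 0.2, 0.1])   # cheap default policy
U = np.array([0.0, 1.0, 3.0, 2.0])     # utility of each action
beta = 2.0                             # the KL-budget knob (inverse temperature)

# Exact Boltzmann-tilted optimum: pi*(a) proportional to pi0(a) exp(beta U(a)).
tilted = pi0 * np.exp(beta * U)
pi_star = tilted / tilted.sum()

# Monte Carlo version: sample from the prior, self-normalise the weights.
draws = rng.choice(actions, size=20_000, p=pi0)
weights = np.exp(beta * U[draws])
estimate = np.array([weights[draws == a].sum() for a in actions]) / weights.sum()

print("exact pi*:    ", np.round(pi_star, 3))
print("estimated pi*:", np.round(estimate, 3))
```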

What it gives us: a formal stand-in for “compute budget” inside decision theory. It is also the bridge to thermodynamics, since the same functional reappears in the substrate principle section. Hafner et al. (Hafner et al. 2022) cast action and perception as joint divergence minimisation in this lineage.

2.5 NARS / Assumption of Insufficient Knowledge and Resources

Pei Wang’s NARS programme defines intelligence as adaptation under insufficient knowledge and resources. Already covered in the economics of cognition notebook; flagging here because it is philosophically closest to the “intelligence is boundedness” argument. It is also vague enough that I cannot work out exactly what they mean.

2.6 Bounded inductive rationality

Oesterheld, Demski & Conitzer (Oesterheld, Demski, and Conitzer 2023) explicitly drop logical omniscience.

LLM summary follows:

Logical omniscience is the standard Bayesian assumption that an agent assigns probability 1 to every theorem and probability 0 to every contradiction the moment those are stated. Classical decision theory needs that assumption to make consequences of beliefs cash out: if a finite reasoner believes Peano arithmetic and PA proves \(\varphi\), then classically the reasoner already believes \(\varphi\). In any practical sense it does not, because nobody has run the proof. Until the agent spends compute to derive \(\varphi\), the sentence’s truth-value is just a probability strictly between 0 and 1 — and the agent needs a coherent way to reason with that probability while the proof is still in flight.
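Not the trader-market construction, just a toy of the predicament: the truth of “N is prime” is already fixed, but a bounded agent’s probability for it moves as it spends compute on rounds of a randomised primality test. The prior and the per-round error bound (a standard fact about Miller–Rabin) are the only ingredients.

```python
import random

def miller_rabin_round(n, rng):
    """One round of Miller-Rabin with a random base; True means 'n passed'."""
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    a = rng.randrange(2, n - 1)
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(r - 1):
        x = (x * x) % n
        if x == n - 1:
            return True
    return False

N = 1_000_000_007      # a fixed mathematical fact, prime or not; we just don't know yet
prior_prime = 0.5      # the agent's prior before spending any compute
rng = random.Random(42)

belief = prior_prime
for k in range(1, 11):
    if not miller_rabin_round(N, rng):
        belief = 0.0   # a definite witness of compositeness: the proof arrived
        break
    # Passing one round has likelihood 1 if prime, at most 1/4 if composite.
    belief = belief / (belief + (1 - belief) * 0.25)
    print(f"after {k} rounds of compute, P(N is prime) >= {belief:.6f}")
```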

The machinery is the logical-induction stuff: a market of traders who bet on whether sentences will be proved, refuted, or remain undecided by a given deductive horizon, and the agent’s beliefs are the market prices. Coherence is not a constraint at any single moment but a no-arbitrage criterion over time — no efficiently-computable trader should be able to extract unbounded money from the agent’s prices as the deductive horizon expands. The agent ends up with calibrated logical beliefs in the limit without ever having been classically coherent on any particular day. ODC’s contribution over the precursor logical-induction work is a clean axiomatisation: the trader-market construction was the existence proof; ODC characterise the property class.

A consequence we will need later: the agent never has to commit to what it will prove. It has prices on its own future actions and inferences, which is the move that dissolves the Löbian / 5-and-10 / spurious-counterfactual traps in embedded agency. A coherent agent that is required to know now what it will do later runs into diagonal arguments; an inductively-rational agent answers “I will give you a probability, and update it as the proof comes in”.

This gives us a story close to the embedded agency problem, but still with bounded compute. This might be the most direct response to the Löbian-paradox stuff without leaving for AIXI’s infinity.

2.7 Computationally tractable choice

Camara (Camara 2021) adds an axiom of computational tractability to decision theory: rule out behaviours that are fundamentally hard.

LLM summary follows:

Camara messes with axioms. Take whatever set of decision-theory axioms you like — rationalizability, transitivity, independence — and add one more, that the agent’s choice function admits a polynomial-time implementation. Choice rules that fail this axiom are not admissible models of choosing agents.

So what does CTC rule out? Not agents — the framework takes for granted that agents exist and make choices. What it rules out are theories of choice. Camara identifies particular preference structures whose induced choice rules cannot be computed in polynomial time, and concludes that those structures cannot describe how any tractable computer chooses. If observed behaviour can only be rationalised by a super-polynomial decision rule, the verdict is that the rationalisation is the wrong account — the right move is to find a tractable rule that also fits the data, not to credit the agent with running an intractable algorithm in its head.

This sits differently from BO. Bounded optimality picks the best program for a given architecture; CTC rules decision theories out for any architecture at all. The two are complementary — BO tells us how to optimise inside the admissible class of theories; CTC tells us what the admissible class is.

What it gives us: a decision-theoretic analogue of bounded optimality, more axiomatic in flavour, easier to plug into economics-of-cognition arguments.

2.8 POMDP-as-agent

Biehl & Virgo (Biehl and Virgo 2023) interpret a system as a POMDP solver: the internal state of the system maps to a belief state about the outside world, and the system’s actions are optimal given that belief.

LLM summary follows:

A system \(S\) is interpretable as a POMDP solver if its internal states map to belief states and its outputs map to policy actions under some POMDP — i.e. its dynamics factor as Bayesian belief-update composed with a value-optimising policy. The interpretation need not be unique: a single physical system can admit multiple POMDP readings, each with its own implied utility, observation model, and hidden-world structure. The criterion is structural — does the system’s behaviour factor in a POMDP shape — without committing to which POMDP it solves.
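A minimal instance of the shape being checked for, not a test of whether an arbitrary system admits it: a two-state world with invented noise parameters, where the internal state is exactly a filtered belief and the output factors through it via a threshold policy.

```python
def belief_update(belief, observation, p_correct=0.8, p_stay=0.9):
    """One step of Bayesian filtering for a 2-state hidden world.

    belief: P(world = 1) before seeing `observation` (0 or 1).
    The world stays put with prob p_stay; the sensor reports the true
    state with prob p_correct. (All numbers are illustrative.)
    """
    # Predict: push the belief through the transition model.
    predicted = belief * p_stay + (1 - belief) * (1 - p_stay)
    # Correct: condition on the observation.
    like_1 = p_correct if observation == 1 else 1 - p_correct
    like_0 = 1 - p_correct if observation == 1 else p_correct
    return like_1 * predicted / (like_1 * predicted + like_0 * (1 - predicted))

def policy(belief, threshold=0.7):
    """The value-optimising half of the factorisation, here just a threshold."""
    return "act-on-1" if belief > threshold else "wait"

# The 'system': its internal state is exactly the filtered belief,
# and its outputs factor through that belief, i.e. the POMDP shape.
belief = 0.5
for obs in [1, 1, 0, 1, 1]:
    belief = belief_update(belief, obs)
    print(f"obs={obs}  belief={belief:.3f}  action={policy(belief)}")
```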

This is the formal version of the good-regulator claim that a well-performing agent contains a model. Biehl & Virgo make “contains a model” a checkable explicit property, and they do it without first committing to a utility function — the system is generically agenty, not necessarily a particular agent. Bounded compute enters as a restriction on which POMDPs admit interpretations: a finite system can only realise belief-updates and policies whose state spaces fit its capacity.

What it gives us: a candidate formal substrate. The good regulator theorem and bounded rationality both translate into statements about which POMDP solutions a finite agent can realise.

2.9 Computationally-embedded big world

Lewandowski et al. (Lewandowski et al. 2025) is the spine of the tension stated above. An automaton inside a universal computer is implicitly constrained — there is no need to bolt the bound on. Their proposed objective, interactivity, measures the agent’s continuing ability to adapt and learn new predictions. Empirically, deep nonlinear networks struggle to maintain interactivity; deep linear networks do better as capacity grows. I expect this paper to be load-bearing for our agenda.

LLM summary follows:

The agent is a finite-state automaton with bounded memory and per-step compute. The world is the state of a universal computer evolving under some computable rule. They exchange symbols across an interface. The agent’s boundedness is a consequence of its automaton fitting inside the world’s automaton, not a separately-imposed budget — we are invited to consider this “agent inside a bigger computer” framing, maybe relevant to boundaries and multi-level agency and all that.

Interactivity, their proposed objective, measures the agent’s continuing capacity to acquire useful new predictions about the world’s dynamics as time goes on. This is the Big-World twist: there is no terminal “I have learned the environment” state, and the right objective is sustained learnability rather than convergence to a stationary optimum. RL’s usual maximise-and-stop framing is the wrong shape for an agent in this regime.

They even have an empirical finding! Expressive nonlinear networks, despite their nominal capacity advantage, lose interactivity as they scale — their representations collapse onto fixed features and they stop generating new predictions. Linear networks scale more gracefully on the same metric. Architectural choices that look like “more capability” under maximise-and-stop may be exactly the wrong choices for an agent that has to keep learning. Whatever the bounded agency cell ends up requiring, “scale up the most expressive network you can afford” is not it.

3 God help us, predictive coding

The Friston-style free-energy / predictive-coding tradition is in this neighbourhood and I want to mark it as such without inheriting its commitments. Viewed from outside, active inference is a particular package of internal-model + KL-constrained policy + thermodynamic substrate; the ingredients line up with what we are assembling here. It looks like a natural approach, but, observationally, something must be wrong with the definitions or the framing, because no one can agree what the theory is “about”. I am hopeful that I will accidentally rediscover the “good bits” of the free-energy principle by proceeding without it.

4 Substrate principle, Cantor trap

The thought experiment in the opening — give the demon complete information and unlimited compute and let it simulate the universe — articulates a weird bias in the literature that I had not found words for before. We ignore that the demon needs somewhere to put the simulation.

Wolpert’s stochastic thermodynamics of computation (David H. Wolpert 2019; David H. Wolpert and Korbel 2026) is one attempt to price the substrate. Any logical operation has a thermodynamic cost; any physical computer is itself a dynamical system whose computation we identify by mapping its dynamics onto an abstract machine. A simulator of the entire universe is, at minimum, a physical system as complex as the universe being simulated — the exact lower bound depends on the dynamics, but it is not zero. So a Laplacean demon needs a bigger universe to live in. There is no infinite-compute thought experiment that is internally coherent without dragging in another, larger, world.
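A back-of-envelope version of “the demon needs somewhere to put the simulation”, using nothing beyond the Landauer bound of \(k_B T \ln 2\) per erased bit; the bit counts are made up and the point is the nonzero floor, not the numbers.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
T = 300.0                 # room temperature, K

def landauer_cost_joules(bits_erased, temperature=T):
    """Minimum dissipation for irreversibly erasing `bits_erased` bits."""
    return bits_erased * K_B * temperature * math.log(2)

# Illustrative bit counts only: a laptop's RAM vs a (wildly lowballed)
# register of microstates for a macroscopic system.
for label, bits in [("16 GB of state", 16 * 8 * 2**30),
                    ("1 mole of two-level systems", 6.022e23)]:
    joules = landauer_cost_joules(bits)
    print(f"{label}: >= {joules:.3e} J per full erase/overwrite")

# The demon that re-simulates the universe must keep overwriting a state
# register at least as large as the system it tracks; the floor on that
# dissipation is not zero, and the register has to live somewhere.
```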

Call this the substrate principle. Any agent — including the one we use to motivate the unbounded-compute limit — is a physical system competing for the same resources as the world it models. Bounded information, bounded compute is not one option of four; it is the only cell that exists. The other three are interesting idealisations — projections of bounded agents in which we let things grow to infinity. They are not realisable, and moreover it is not clear to me that we ever get close enough to any of them for them to be a good approximation.

The bias in the embedded agency and broader agent foundations literature is what I like to call a Cantor trap: treating an unrealisable combination of compute or data as a place where one can stand and reason, and then “approximating down” from it. The trap looks like progress because Cantorian objects — countably-infinite hypothesis classes, ω-limits, transfinite hierarchies of agents — are mathematically clean and aesthetically seductive (Hilbert’s paradise from which no one shall expel us!). In practice the infinite limit seems to be a vanishing point. The diagonalisation puzzles flagged in bounded inductive rationality — Löb, 5-and-10, spurious counterfactuals — are symptoms of the same trap: what happens when one tries to make an agent’s reasoning span its own infinite compute. It is worth being aware of the dangers of this trap. Infinities are cool, ngl, but we need to know how informative they are about finite agents before we use ‘em.

A particular annoyance: my embedded agency notebook was my attempt to understand careful, technically polished work on reflective stability, logical counterfactuals over the agent’s own actions, and value-stability under self-modification — exactly the problems the Cantor trap generates. From the bounded agents perspective, the same questions look different: some dissolve, others survive but in milder versions, about capacity and prices on uncertainty rather than impossible self-knowledge. AFAICT a tradition with this much technical care has produced surprisingly little to help us reason about agents that are “like us” in this important way.

Solé et al. (2024) line up adjacent constraints from the biology side — what living systems can and cannot be, given their thermodynamic and informational budgets. Useful as a sanity check that the substrate principle shows up empirically in the systems we already call agents.

Future-me, have you read all these papers yet? One tempting next step is a specific lower bound on a model’s complexity in terms of the agent’s thermodynamic budget. Is that done?

5 Where next

Chatbot suggests:

  1. Read Lewandowski et al. (2025) and Baltieri et al. (2025) carefully and write the diff. Both are about embedded agents containing models; one is information-theoretic and asymptotic, the other categorical and structural. There may be a tidy correspondence.
  2. Pick one toy problem — a satisficing LQR, or a tree-search agent with anytime stopping — and run the resource-rational, bounded-optimal, and KL-constrained derivations side-by-side. Verify that they tell the same story, or notice where they diverge.
  3. Promote the substrate principle (and the Cantor trap with it) to a standalone post; the ingredients above are enough for one.
  4. Sketch what a “good regulator theorem under a thermodynamic budget” would look like, even at one page. Surely this must be done? I have not read All the Things yet.

6 Incoming

  • Tishby & Polani’s information theory of the perception–action cycle (Tishby and Polani 2011) — close to Ortega-Braun, but starts from the cycle.
  • Ho et al. on simplified mental representations for planning (Ho et al. 2022) — empirical evidence that humans plan with a deliberately impoverished model.
  • Gigerenzer & Goldstein on fast-and-frugal heuristics (Gigerenzer and Goldstein 1996) — anchor for the descriptive bounded-rationality literature based on humans.

7 References

Baltieri, Biehl, Capucci, et al. 2025. "A Bayesian Interpretation of the Internal Model Principle."
Biehl, and Virgo. 2023. "Interpreting Systems as Solving POMDPs: A Step Towards a Formal Understanding of Agency." In.
Camara. 2021. "Computationally Tractable Choice."
Conant, and Ashby. 1970. "Every Good Regulator of a System Must Be a Model of That System." International Journal of Systems Science.
Francis, and Wonham. 1976. "The Internal Model Principle of Control Theory." Automatica.
Gershman, Horvitz, and Tenenbaum. 2015. "Computational Rationality: A Converging Paradigm for Intelligence in Brains, Minds, and Machines." Science.
Gigerenzer, and Goldstein. 1996. "Reasoning the Fast and Frugal Way: Models of Bounded Rationality." Psychological Review.
Hafner, Ortega, Ba, et al. 2022. "Action and Perception as Divergence Minimization."
Hay, Russell, Tolpin, et al. 2012. "Selecting Computations: Theory and Applications." In Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence.
Ho, Abel, Correa, et al. 2022. "People Construct Simplified Mental Representations to Plan." Nature.
Huang, Isidori, Marconi, et al. 2018. "Internal Models in Control, Biology and Neuroscience." In 2018 IEEE Conference on Decision and Control (CDC).
Lewandowski, Ramesh, Meyer, et al. 2025. "The World Is Bigger: A Computationally-Embedded Perspective on the Big World Hypothesis." In.
Lieder, and Griffiths. 2020. "Resource-Rational Analysis: Understanding Human Cognition as the Optimal Use of Limited Computational Resources." Behavioral and Brain Sciences.
Oesterheld, Demski, and Conitzer. 2023. "A Theory of Bounded Inductive Rationality." Electronic Proceedings in Theoretical Computer Science.
Ortega, Pedro Alejandro, and Braun. 2011. "Information, Utility and Bounded Rationality." In Proceedings of the 4th International Conference on Artificial General Intelligence. AGI'11.
Ortega, Pedro A., and Braun. 2013. "Thermodynamics as a Theory of Decision-Making with Information-Processing Costs." Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
Russell, S. J., and Subramanian. 1995. "Provably Bounded-Optimal Agents." Journal of Artificial Intelligence Research.
Russell, Stuart, and Wefald. 1991. "Principles of Metareasoning." Artificial Intelligence.
Solé, Kempes, Corominas-Murtra, et al. 2024. "Fundamental Constraints to the Logic of Living Systems." Interface Focus.
Tishby, and Polani. 2011. "Information Theory of Decisions and Actions." In PERCEPTION-ACTION CYCLE.
Wolpert, David H. 2019. "Stochastic Thermodynamics of Computation."
Wolpert, David H., and Korbel. 2026. "What Does It Mean for a System to Compute?" Journal of Physics: Complexity.

Footnotes

  1. Conant & Ashby (Conant and Ashby 1970) is the original cybernetics flavour; Francis & Wonham (Francis and Wonham 1976) gives the control-theoretic version; Huang et al. (Huang et al. 2018) is a recent tutorial; Baltieri, Biehl, Capucci & Virgo (Baltieri et al. 2025) is a Bayesian / categorical reformulation with fewer assumptions. Longer notes in internal model principles.↩︎

  2. Briefly summarised in the economics of cognition notebook under “rational metareasoning”. The Hay et al. (2012) meta-MDP is the operationalisation.↩︎