Predictive coding, Free Energy in the sense of Friston

Should a model that our brains do Bayesian variational prediction update any priors about our brains?

2011-11-27 — 2024-09-22

Wherein the free‑energy account of predictive coding is laid out, its variational Bayesian form is sketched, and the dark‑room paradox plus computable message‑passing implementations are noted.

AI safety

energy

learning

mind

neuron

probability

statistics

statmech

To learn: Is this what the “information-dynamics” folks are wondering about also, e.g. Ay et al. (2008) or Tishby and Polani (2011)? Perhaps the overview of different brain models in Neural Annealing: Toward a Neural Theory of Everything will place it in context.

There is some interesting along the lines of understanding biological learning as machine learning, e.g. Predictive Coding has been Unified with Backpropagation, summarises Millidge, Tschantz, and Buckley (2020b).

If we think there are multiple learning algorithms, do we find ourselves in a multi agent self situation?

Maybe related (?) prediction processes.

1 Lay intros

Scott Alexander reviews Clark (2015)
Confirmation Bias in Action

My goal with this post is to reconceptualize confirmation bias using the paradigm of predictive processing, with particular focus on the centrality of action. This focuses CB as a phenomenon of human epistemology, with our brains designed to manage our bodies and move them to manipulate the world. This view of CB wouldn’t apply to, for example, a lonely oracle AI passively monitoring the world and making predictions.
The Limits of the Free Energy Principle: A Systematic Critique of the Theoretical Foundations of Friston’s Predictive Processing
Jordana Cepelewicz, In Brain Waves, Scientists See Neurons Juggle Possible Futures
Lisa Feldman Barrett, Simplistic ‘Fight or Flight’ Idea Undervalues the Brain’s Predictive Powers

2 “Free energy principle”

The terminology here is baffling. The chief pusher of this wheelbarrow appears to be Karl Friston (e.g. K. Friston 2010, 2013; Williams 2020). He starts his Nature Reviews Neuroscience piece with this statement of the principle:

The free-energy principle says that any self-organising system that is at equilibrium with its environment must minimize its free energy.

Is that “must” in the sense that it is a

moral obligation, or
a testable conservation law of some kind?

If the latter, self-organising in what sense? What type of equilibrium? For which definition of the free energy? What is our chief experimental evidence for this hypothesis?

It could mean that, any right-thinking brain, seeking to avoid the vice of slothful and decadent perception after the manner of foreigners and compulsive masturbators, would do well to seek to maximize its free energy before partaking of a stimulating and refreshing physical recreation, such as a game of cricket.

Perhaps this version is better:

The free energy principle (FEP) claims that self-organization in biological agents is driven by variational free energy (FE) minimization in a generative probabilistic model of the agent’s environment.

K. Friston (2010) continues with a definition of free energy itself, via a diagram, which

…shows the dependencies among the quantities that define free energy. These include the internal states of the brain \(\mu(t)\) and quantities describing its exchange with the environment: sensory signals (and their motion) \(\bar{s}(t) = [s,s',s''…]^T\) plus action \(a(t)\). The environment is described by equations of motion, which specify the trajectory of its hidden states. The causes \(\vartheta \supset {\bar{x}, \theta, \gamma }\) of sensory input comprise hidden states \(\bar{x} (t),\) parameters \(\theta\), and precisions \(\gamma\) controlling the amplitude of the random fluctuations \(\bar{z}(t)\) and \(\bar{w}(t)\). Internal brain states and action minimize free energy \(F(\bar{s}, \mu)\), which is a function of sensory input and a probabilistic representation \(q(\vartheta|\mu)\) of its causes. This representation is called the recognition density and is encoded by internal states \(\mu\).

The free energy depends on two probability densities: the recognition density \(q(\vartheta|\mu)\) and one that generates sensory samples and their causes, \(p(\bar{s},\vartheta|m)\). The latter represents a probabilistic generative model (denoted by \(m\)), the form of which is entailed by the agent or brain…

\[F = -<\ln p(\bar{s},\vartheta|m)>_q + -<\ln q(\vartheta|\mu)>_q\]

This is (minus the actions) the variational free energy in Bayesian inference.

OK, so self-organising systems must improve their variational approximations to posterior beliefs? What is the contentful prediction of this meta-model?

3 World models

See world models.

4 Testable hypotheses

Could predictive coding generate some testable hypotheses such that we can count it as science? Here are some phenomena that look like they must relate.

Evolution of perception
addiction?
trauma?
depression?
How Self-Sabotage Saves You From Anxiety
Stress and Serotonin
Trapped Priors As A Basic Problem Of Rationality. 🚧TODO🚧 clarify NB I think this phenomenon is interesting but I wish he did not use sloppy Bayesian terminology in this case; it sounds like he is talking about an excessively tight prior, but the dynamics of this case diverge in substantive ways from a direct Bayesian update with an excessively tight prior. One would need some more complicated structure to explain the observation, such as a hierarchical model incorporating observation reliability, or an action-observation loop. For more on that, see trapped beliefs
Is the “Better the devil you know” problem a predictive coding problem?

4.1 Dark room problem

Basically: why do anything at all? Why not just live by self-fulfilling prophecies that guarantee good predictions, but ensuring that essentially nothing happens that requires non-trivial predictions (K. Friston, Thornton, and Clark 2012)?

Question: is this a perspective on depression?

5 Computable

ForneyLab (Akbayrak, Bocharov, and de Vries 2021) is a variational message passing library for probabilistic graphical models with an eye to computing it in a biologically plausible way and building predictive coding models.

ForneyLab is designed with a focus on flexibility, extensibility and applicability to biologically plausible models for perception and decision making, such as the hierarchical Gaussian filter (HGF). With ForneyLab, the search for better models for perception and action can be accelerated

6 Incoming

What does all this say about classical psychological therapies? e.g. Acceptance and commitment therapy.

7 References

Abramsky, Banzhaf, Caves, et al. 2025. “Open Questions about Time and Self-Reference in Living Systems.”

Aguilera, Millidge, Tschantz, et al. 2021. “How Particular Is the Physics of the Free Energy Principle?” arXiv:2105.11203 [q-Bio].

Akbayrak, Bocharov, and de Vries. 2021. “Extended Variational Message Passing for Automated Approximate Bayesian Inference.” Entropy.

Ay, Bertschinger, Der, et al. 2008. “Predictive Information and Explorative Behavior of Autonomous Robots.” The European Physical Journal B - Condensed Matter and Complex Systems.

Bishop. 2021. “Artificial Intelligence Is Stupid and Causal Reasoning Will Not Fix It.” Frontiers in Psychology.

Bridgland, Jones, and Bellet. 2023. “A Meta-Analysis of the Efficacy of Trigger Warnings, Content Warnings, and Content Notes.” Clinical Psychological Science.

Broockman, and Kalla. 2020. “When and Why Are Campaigns’ Persuasive Effects Small? Evidence from the 2020 US Presidential Election.”

Cao, Pastukhov, Aleshin, et al. n.d. “Binocular Rivalry Reveals an Out-of-Equilibrium Neural Dynamics Suited for Decision-Making.” eLife.

Carhart-Harris, and Nutt. 2017. “Serotonin and Brain Function: A Tale of Two Receptors.” Journal of Psychopharmacology.

Clark. 2015. Surfing Uncertainty: Prediction, Action, and the Embodied Mind.

Conant, and Ashby. 1970. “Every Good Regulator of a System Must Be a Model of That System.” International Journal of Systems Science.

Cox, van de Laar, and de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning.

Crutchfield, and Jurgens. 2025. “Agentic Information Theory: Ergodicity and Intrinsic Semantics of Information Processes.”

Da Costa, Friston, Heins, et al. 2021. “Bayesian Mechanics for Stationary Processes.” arXiv:2106.13830 [Math-Ph, Physics:nlin, q-Bio].

Du, Kosoy, Dayan, et al. 2023. “What Can AI Learn from Human Exploration? Intrinsically-Motivated Humans and Agents in Open-World Exploration.” In.

Fields, and Levin. 2019. “Somatic Multicellularity as a Satisficing Solution to the Prediction-Error Minimization Problem.” Communicative & Integrative Biology.

Francis, and Wonham. 1976. “The Internal Model Principle of Control Theory.” Automatica.

Friston, Karl. 2010. “The Free-Energy Principle: A Unified Brain Theory?” Nature Reviews Neuroscience.

———. 2013. “Life as We Know It.” Journal of The Royal Society Interface.

Friston, Karl J., Parr, and de Vries. 2017. “The Graphical Brain: Belief Propagation and Active Inference.” Network Neuroscience.

Friston, Karl J, Ramstead, Kiefer, et al. 2024. “Designing Ecosystems of Intelligence from First Principles.” Collective Intelligence.

Friston, Karl, Thornton, and Clark. 2012. “Free-Energy Minimization and the Dark-Room Problem.” Frontiers in Psychology.

Glymour. 2007. “When Is a Brain Like the Planet?” Philosophy of Science.

Griffiths, Thomas L., Chater, and Tenenbaum. 2024. Bayesian Models of Cognition: Reverse Engineering the Mind.

Griffiths, Thomas L, Kemp, and Tenenbaum. n.d. “Bayesian Models of Cognition.”

Hafner, Ortega, Ba, et al. 2022. “Action and Perception as Divergence Minimization.”

Heald, Lengyel, and Wolpert. 2021. “Contextual Inference Underlies the Learning of Sensorimotor Repertoires.” Nature.

Ho, Abel, Correa, et al. 2022. “People Construct Simplified Mental Representations to Plan.” Nature.

Hyland, Gavenčiak, Costa, et al. 2024. “Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents.” In.

Kay, Chung, Sosa, et al. 2020. “Constant Sub-second Cycling between Representations of Possible Futures in the Hippocampus.” Cell.

Lidayan, Du, Kosoy, et al. 2025. “Intrinsically-Motivated Humans and Agents in Open-World Exploration.”

Ma, Kording, and Goldreich. 2022. Bayesian Models of Perception and Action.

Mathys, Daunizeau, Friston, et al. 2011. “A Bayesian Foundation for Individual Learning Under Uncertainty.” Frontiers in Human Neuroscience.

Miller Tate. 2021. “A Predictive Processing Theory of Motivation.” Synthese.

Millidge, Tschantz, and Buckley. 2020a. “Whence the Expected Free Energy?” arXiv:2004.08128 [Cs].

———. 2020b. “Predictive Coding Approximates Backprop Along Arbitrary Computation Graphs.” arXiv:2006.04182 [Cs].

O’Connor, Wellisch, Stanton, et al. 2008. “Craving Love? Enduring Grief Activates Brain’s Reward Center.” NeuroImage.

Ororbia, and Mali. 2023. “The Predictive Forward-Forward Algorithm.”

Parr, Markovic, Kiebel, et al. 2019. “Neuronal Message Passing Using Mean-Field, Bethe, and Marginal Approximations.” Scientific Reports.

Parr, Pezzulo, and Friston. 2022. Active Inference: The Free Energy Principle in Mind, Brain, and Behavior.

Piekarski. 2021. “Understanding Predictive Processing. A Review.” AVANT. Pismo Awangardy Filozoficzno-Naukowej.

Porr, and Miller. 2020. “Forward Propagation Closed Loop Learning.” Adaptive Behavior.

Ramírez-Ruiz, Grytskyy, Mastrogiuseppe, et al. 2024. “Complex Behavior from Intrinsic Motivation to Occupy Future Action-State Path Space.” Nature Communications.

Ramstead, Sakthivadivel, Heins, et al. 2023. “On Bayesian Mechanics: A Physics of and by Beliefs.” Interface Focus.

Ringstrom. 2022. “Reward Is Not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning.”

Sakthivadivel. 2022. “Towards a Geometry and Analysis for Bayesian Mechanics.”

Schmidhuber. 2010. “Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010).” IEEE Transactions on Autonomous Mental Development.

Shepard. 2004. “How a Cognitive Psychologist Came to Seek Universal Laws.” Psychonomic Bulletin & Review.

Still. 2020. “Thermodynamic Cost and Benefit of Memory.” Physical Review Letters.

Tarsney. 2025. “Will Artificial Agents Pursue Power by Default?”

Tishby, Pereira, and Bialek. 2000. “The Information Bottleneck Method.”

Tishby, and Polani. 2011. “Information Theory of Decisions and Actions.” In PERCEPTION-ACTION CYCLE.

Turner, Smith, Shah, et al. 2021. “Optimal Policies Tend To Seek Power.” In Advances in Neural Information Processing Systems.

Laar, Thijs van de, Cox, Senoz, et al. 2018. “ForneyLab: A Toolbox for Biologically Plausible Free Energy Minimization in Dynamic Neural Models.” In Conference on Complex Systems.

Laar, Thijs W. van de, and de Vries. 2019. “Simulating Active Inference Processes by Message Passing.” Frontiers in Robotics and AI.

Laar, Thijs van de, Koudahl, and de Vries. 2023. “Realising Synthetic Active Inference Agents, Part II: Variational Message Updates.”

Laar, Thijs van de, Koudahl, van Erp, et al. 2022. “Active Inference and Epistemic Value in Graphical Models.” Frontiers in Robotics and AI.

Virgo, Biehl, Baltieri, et al. 2025. “A ‘Good Regulator Theorem’ for Embodied Agents.”

Williams. 2020. “Predictive Coding and Thought.” Synthese.