Causality, agency, decisions, learning

Causal inference involving agents, decisions, and learning

2018-10-23 — 2026-05-05

In Which Influence Diagrams Are Employed as Extensions of Bayesian Networks to Represent Decision Problems, With Multi-Agent Systems and Causal Attribution Also Considered.

adaptive
agents
causality
cooperation
extended self
game theory
graphical models
incentive mechanisms
learning
mind
networks
utility
Figure 1

Notes on decision theory and causality in which agents make decisions, especially in the context of AI safety.

For bonus points, we might consider multiple agents.

This is something I’m actively trying to understand better.

There is some mysterious causality juju in foundation models and other neural nets. This suggests to me that we should think hard about it as we move into the age of AI.

As far as I can tell, reasoning about intelligent systems with causality requires some extensions to vanilla causality, because intelligent systems can reason about the outcomes they wish to achieve, which makes things complicated and occasionally weird.

TBC.

1 Causality with feedback

A thermostat is a basic example of causality under feedback; see causality under feedback.
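To make the thermostat concrete, here is a minimal sketch of my own (not taken from the linked notes). It assumes a discrete-time bang-bang controller with made-up `heat` and `loss` rates; the function name `simulate_thermostat` and all the numbers are hypothetical. The point is that the apparent cycle between temperature and heater unrolls into an acyclic chain \(T_t \to H_t \to T_{t+1}\) once we index by time.

```python
def simulate_thermostat(t0, setpoint, steps, heat=1.5, loss=1.0):
    """Unroll the temperature/heater feedback loop over discrete time.

    Assumed dynamics (illustrative only):
      H_t     = 1 if T_t < setpoint else 0   (heater reads temperature)
      T_{t+1} = T_t + heat * H_t - loss      (temperature reads heater)
    """
    temps, heater = [t0], []
    for _ in range(steps):
        h = 1 if temps[-1] < setpoint else 0
        heater.append(h)
        temps.append(temps[-1] + heat * h - loss)
    return temps, heater


temps, heater = simulate_thermostat(t0=15.0, setpoint=20.0, steps=30)
```

Each arrow in the unrolled graph points forward in time, so the usual DAG machinery applies to the unrolled system even though the summary picture "temperature causes heater causes temperature" looks cyclic.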

2 Influence diagrams

A recent, useful introduction is Everitt et al. (2021), so let’s follow it.

A Bayesian network over a set of random variables \(\boldsymbol{V}\) with joint distribution \(\operatorname{Pr}(\boldsymbol{V})\) is a directed acyclic graph (DAG) \(\mathcal{G}=(\boldsymbol{V}, \mathcal{E})\) with vertices \(\boldsymbol{V}\) and edges \(\mathcal{E}\) such that the joint distribution can be factorised as \(\operatorname{Pr}(\boldsymbol{V})=\prod_{V \in \boldsymbol{V}} \operatorname{Pr}\left(V \mid \boldsymbol{Pa}_V\right)\), where \(\boldsymbol{Pa}_V\) are the parents of \(V\) in \(\mathcal{G}\).

This much should be familiar from causal DAGs.
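As a sanity check on the factorisation, here is a small sketch under assumptions of my own: a hypothetical three-node binary network (rain `R`, sprinkler `S`, wet grass `W`) with invented conditional probabilities. It builds the joint as a product of \(\operatorname{Pr}(V \mid \boldsymbol{Pa}_V)\) terms and confirms that the result is a proper distribution.

```python
from itertools import product

# Hypothetical DAG: R -> S, R -> W, S -> W. Keys are nodes, values are parents.
parents = {"R": (), "S": ("R",), "W": ("R", "S")}

# cpt[V][parent_values] = Pr(V = 1 | parents). All numbers are made up.
cpt = {
    "R": {(): 0.2},
    "S": {(0,): 0.4, (1,): 0.01},
    "W": {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
}


def joint(assignment):
    """Pr(V = v) as the product over nodes of Pr(V | Pa_V)."""
    p = 1.0
    for v, pa in parents.items():
        p_one = cpt[v][tuple(assignment[u] for u in pa)]
        p *= p_one if assignment[v] == 1 else 1.0 - p_one
    return p


names = list(parents)  # ["R", "S", "W"]
table = {
    vals: joint(dict(zip(names, vals)))
    for vals in product((0, 1), repeat=len(names))
}
total = sum(table.values())  # should be 1 up to rounding
p_wet = sum(p for (r, s, w), p in table.items() if w == 1)  # marginal Pr(W = 1)
```

Any marginal or conditional can then be read off `table` by summation, which is all that "inference" means in a network this small.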

Influence diagrams generalize Bayesian networks to represent decision problems, using “square nodes for decision variables, diamond nodes for utility variables, and round nodes for everything else.” In contrast to classic influence diagrams, the Everitt et al. (2021) variant (Causal Influence Models) carries probability distributions over the decision variables.
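To see what "solving" such a diagram involves, here is a toy sketch, not the construction from Everitt et al. (2021). It assumes a chance node `S` that is uniform on \(\{-1, 1\}\), a decision node `D` that observes `S`, and a utility node `U = S * D`; those choices and the helper names are mine. A policy is a map from observations to decisions, and we pick the one with the highest expected utility by brute force.

```python
from itertools import product

S_VALUES = (-1, 1)  # chance node S, assumed uniform
D_VALUES = (-1, 1)  # decision node D, which observes S


def utility(s, d):
    """Utility node U with parents S and D (assumed functional form)."""
    return s * d


def expected_utility(policy):
    """Average utility when D is chosen as policy[s] and S is uniform."""
    return sum(utility(s, policy[s]) for s in S_VALUES) / len(S_VALUES)


# Enumerate every deterministic policy, i.e. every function S -> D.
policies = [
    dict(zip(S_VALUES, ds)) for ds in product(D_VALUES, repeat=len(S_VALUES))
]
best = max(policies, key=expected_utility)
```

Brute-force enumeration only works for tiny diagrams; it is here to show what a policy is, not how one would compute it at scale.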

3 Mechanization

The other extension we want is mechanization: pair every object-level variable \(V\) with a mechanism variable \(M_V\) encoding how \(V\) is computed from its parents (Kenton et al. 2023). Mechanism nodes become first-class, so we can talk about edges between mechanisms — an agent reasoning about how its objective depends on the world’s mechanism, or a predictor reading an agent’s policy (MacDermott, Everitt, and Belardinelli 2023).

The combination, mechanized influence diagrams, is the formalism that lets us think about decisions where the agent’s policy is itself a node in the graph. This is worked through in Decision theory in mechanized causal graphs.
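A rough sketch of the idea, under my own assumptions rather than the algorithm in Kenton et al. (2023): take a hypothetical three-node diagram, give every variable \(V\) a mechanism parent \(M_V\), then intervene on the utility mechanism and see whether the optimal policy moves. If it does, we draw an edge \(M_U \to M_D\). The node names, the two utility mechanisms, and the helper `best_policy` are all illustrative.

```python
from itertools import product

# Object-level diagram (hypothetical): S -> D, S -> U, D -> U.
object_edges = [("S", "D"), ("S", "U"), ("D", "U")]
nodes = sorted({v for edge in object_edges for v in edge})

# Mechanization: every object-level variable V gets a mechanism parent M_V.
mech_edges = object_edges + [(f"M_{v}", v) for v in nodes]

S_VALUES = (-1, 1)
D_VALUES = (-1, 1)


def best_policy(utility):
    """Brute-force optimal policy S -> D for a given utility mechanism."""
    policies = [
        dict(zip(S_VALUES, ds))
        for ds in product(D_VALUES, repeat=len(S_VALUES))
    ]
    return max(policies, key=lambda pi: sum(utility(s, pi[s]) for s in S_VALUES))


# Two settings of the mechanism variable M_U (both invented for illustration).
utility_mechanisms = {
    "match": lambda s, d: s * d,
    "mismatch": lambda s, d: -s * d,
}
policy_for = {name: best_policy(u) for name, u in utility_mechanisms.items()}

# If intervening on M_U changes the optimal policy, M_D depends on M_U.
if policy_for["match"] != policy_for["mismatch"]:
    mech_edges.append(("M_U", "M_D"))
```

The object-level graph is identical in both settings; only the mechanism-level edge records that the policy adapts to the objective, which is the kind of dependence that marks a decision node as belonging to an agent.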

4 Mechanized multi-agent DAGs

We can extend mechanized influence diagrams to include many agents deciding about each other. See causality with multiple agents.

5 Identifying agency

What even is agency? How do we recognize it in natural and artificial systems? What are the implications for control, economics, and technology?

Discovering Agents (Kenton et al. 2023; MacDermott et al. 2024) takes an empirical look at the question of agency by examining, AFAICT, what counts as a deciding node in a mechanized causal graph.

6 Causal attribution and blameworthiness

I should write more about this, since it connects to computational morality. Everitt et al. (2022) and Halpern and Kleiman-Weiner (2018) are relevant works in this domain.

7 Causal, evidential, and updateless decision theory

Newcomb’s problem might be amenable to decomposition by causal methods — specifically mechanized causal methods, since the structural feature of Newcomb is “the predictor reads the agent’s policy”, and that’s the kind of dependence mechanism edges express. Worked through in Decision theory in mechanized causal graphs.
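For orientation, here is the textbook arithmetic behind the disagreement, as a sketch with assumed numbers: the usual payoffs of 1,000,000 and 1,000, a predictor `accuracy` of 0.99, and a prior `p_full` of 0.5 that the opaque box is full. The function names are hypothetical, and this is not the mechanized analysis itself, just the two expected-utility calculations it has to account for.

```python
def edt_values(accuracy, big=1_000_000, small=1_000):
    """Evidential: condition on the action, which is evidence about the prediction."""
    one_box = accuracy * big
    two_box = (1 - accuracy) * big + small
    return {"one_box": one_box, "two_box": two_box}


def cdt_values(p_full, big=1_000_000, small=1_000):
    """Causal: intervene on the action, holding the box contents at prior p_full."""
    one_box = p_full * big
    two_box = p_full * big + small
    return {"one_box": one_box, "two_box": two_box}


edt = edt_values(accuracy=0.99)  # one-boxing looks better
cdt = cdt_values(p_full=0.5)     # two-boxing looks better, for any p_full
```

Under intervention on the action, two-boxing wins by exactly `small` whatever the prior. Under conditioning, the action carries information about the prediction, and one-boxing wins once the predictor is accurate enough.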

8 Tooling

Fox and coauthors wrote a library for computing with various interesting causal influence diagrams, causalincentives/pycid (Fox et al. 2021):

Library for graphical models of decision making, based on pgmpy and networkx

9 References

Ånestrand. 2024. “Emergence of Agency from a Causal Perspective.”
Ashurst, Carey, Chiappa, et al. 2022. “Why Fair Labels Can Yield Unfair Predictions: Graphical Conditions for Introduced Unfairness.”
Bell, Linsefors, Oesterheld, et al. 2021. “Reinforcement Learning in Newcomblike Environments.” In Advances in Neural Information Processing Systems.
Benford. 2010. “What Does Newcomb’s Paradox Teach Us?”
Biehl, and Virgo. 2023. “Interpreting Systems as Solving POMDPs: A Step Towards a Formal Understanding of Agency.” In.
Bongers, Forré, Peters, et al. 2021. “Foundations of Structural Causal Models with Cycles and Latent Variables.” The Annals of Statistics.
Caniglia, Murray, Hernán, et al. n.d. “Estimating Optimal Dynamic Treatment Strategies Under Resource Constraints Using Dynamic Marginal Structural Models.” Statistics in Medicine.
Cao, Feng, Fang, et al. 2025. “Towards Empowerment Gain Through Causal Structure Learning in Model-Based RL.”
Cao, Feng, Huo, et al. 2025. “Causal Action Empowerment for Efficient Reinforcement Learning in Embodied Agents.” Science China Information Sciences.
Carey, Langlois, Merwijk, et al. 2025. “Incentives for Responsiveness, Instrumental Control and Impact.”
Chiappa, and Isaac. 2019. “A Causal Bayesian Networks Viewpoint on Fairness.” In Privacy and Identity Management. Fairness, Accountability, and Transparency in the Age of Big Data. IFIP Advances in Information and Communication Technology.
Chu, Rule, Goddu, et al. 2025. “Fun Isn’t Easy: Children Selectively Manipulate Task Difficulty When ‘Playing for Fun’ Versus ‘Playing to Win’.” Developmental Psychology.
Correa, and Bareinboim. 2020. “A Calculus for Stochastic Interventions: Causal Effect Identification and Surrogate Experiments.” Proceedings of the AAAI Conference on Artificial Intelligence.
Dawid. 2002. “Influence Diagrams for Causal Modelling and Inference.” International Statistical Review.
Everitt, Carey, Langlois, et al. 2021. “Agent Incentives: A Causal Perspective.” In Proceedings of the AAAI Conference on Artificial Intelligence.
Everitt, Garbacea, Bellot, et al. 2025. “Evaluating the Goal-Directedness of Large Language Models.”
Everitt, Ortega, Barnes, et al. 2022. “Understanding Agent Incentives Using Causal Influence Diagrams. Part I: Single Action Settings.”
Fernández-Loría, and Provost. 2021. “Causal Decision Making and Causal Effect Estimation Are Not the Same… and Why It Matters.”
Fox, Everitt, Carey, et al. 2021. “PyCID: A Python Library for Causal Influence Diagrams.” In.
Fox, MacDermott, Hammond, et al. 2023. “On Imperfect Recall in Multi-Agent Influence Diagrams.” Electronic Proceedings in Theoretical Computer Science.
Gans. 2018. “Self-Regulating Artificial General Intelligence.”
Geiger, Ibeling, Zur, et al. 2024. “Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability.”
Gopnik, Glymour, Sobel, et al. 2004. “A Theory of Causal Learning in Children: Causal Maps and Bayes Nets.” Psychological Review.
Hafner, Ortega, Ba, et al. 2022. “Action and Perception as Divergence Minimization.”
Halpern, Joseph Y. 1998. “Axiomatizing Causal Reasoning.” In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence. UAI’98.
Halpern, J. Y. 2000. “Axiomatizing Causal Reasoning.” Journal of Artificial Intelligence Research.
Halpern, Joseph Y., and Kleiman-Weiner. 2018. “Towards Formal Definitions of Blameworthiness, Intention, and Moral Responsibility.”
Halpern, Joseph Y., and Piermont. 2024. “Subjective Causality.”
Hammond, Fox, Everitt, et al. 2023. “Reasoning about Causality in Games.” Artificial Intelligence.
Heckerman, and Shachter. 1994. “A Decision-Based View of Causality.” In Proceedings of the Tenth International Conference on Uncertainty in Artificial Intelligence. UAI’94.
Herrmann, Mohseni, Levinstein, et al. 2026. “A Bayesian Reduction of Causation.”
Hoel. 2025. “Causal Emergence 2.0: Quantifying Emergent Complexity.”
Howard, and Matheson. 2005. “Influence Diagrams.” Decision Analysis.
Kekić, Schneider, Büchler, et al. 2025. “Learning Nonlinear Causal Reductions to Explain Reinforcement Learning Policies.”
Kenton, Kumar, Farquhar, et al. 2023. “Discovering Agents.” Artificial Intelligence.
Koller, and Milch. 2003. “Multi-Agent Influence Diagrams for Representing and Solving Games.” Games and Economic Behavior, First World Congress of the Game Theory Society.
Kulveit, Douglas, Ammann, et al. 2025. “Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development.”
Lattimore. 2017. “Learning How to Act: Making Good Decisions with Machine Learning.”
Liu, Wang, Li, et al. 2024. “Attaining Human Desirable Outcomes in Human-AI Interaction via Structural Causal Games.”
Loewith, and Street. 2025. “Mutual Prediction in Human–AI Coevolution.” Antikythera Digital Journal.
MacDermott, Everitt, and Belardinelli. 2023. “Characterising Decision Theories with Mechanised Causal Graphs.”
MacDermott, Fox, Belardinelli, et al. 2024. “Measuring Goal-Directedness.”
Meulemans, Schug, Kobayashi, et al. 2023. “Would I Have Gotten That Reward? Long-Term Credit Assignment by Counterfactual Contribution Analysis.”
Mollo, and Millière. 2023. “The Vector Grounding Problem.”
Orseau, McGill, and Legg. 2018. “Agents and Devices: A Relative Definition of Agency.”
Richens, and Everitt. 2024. “Robust Agents Learn Causal World Models.”
Schulte, and Poupart. 2024. “Why Online Reinforcement Learning Is Causal.”
Virgo, Biehl, and McGregor. 2021. “Interpreting Dynamical Systems as Bayesian Reasoners.”
Ward, Francis Rhys, MacDermott, Belardinelli, et al. 2024. “The Reasons That Agents Act: Intention and Instrumental Goals.”
Ward, Francis, Toni, Belardinelli, et al. 2023. “Honesty Is the Best Policy: Defining and Mitigating AI Deception.” In Advances in Neural Information Processing Systems.
Wen, Zhong, Khan, et al. 2024. “Language Models Learn to Mislead Humans via RLHF.”
Wolpert, and Benford. 2013. “The Lesson of Newcomb’s Paradox.” Synthese.
Yiu, Allen, Ginosar, et al. 2025. “Empowerment Gain and Causal Model Construction: Children and Adults Are Sensitive to Controllability and Variability in Their Causal Interventions.”