Causality, agency, decisions, learning
Causal inference involving agents, decisions, and learning
2018-10-23 — 2026-05-05
In Which Influence Diagrams Are Employed as Extensions of Bayesian Networks to Represent Decision Problems, With Multi-Agent Systems and Causal Attribution Also Considered.
Notes on decision theory and causality in which agents make decisions, especially in the context of AI safety.
For bonus points, we might consider multiple agents.
This is something I’m actively trying to understand better.
There is some mysterious causality juju in foundation models and other neural nets. This suggests to me that we should think hard about it as we move into the age of AI.
As far as I can tell, reasoning about intelligent systems with causality requires some extensions to vanilla causality, because intelligent systems can reason about the outcomes they wish to achieve, which makes things complicated and occasionally weird.
TBC.
1 Causality with feedback
A thermostat is a basic example of causality under feedback; see causality under feedback.
2 Influence diagrams
A recent, useful introduction is Everitt et al. (2021), so let’s follow it.
A Bayesian network over a set of random variables \(\boldsymbol{V}\) with joint distribution \(\operatorname{Pr}(\boldsymbol{V})\) is a directed acyclic graph (DAG) \(\mathcal{G}=(\boldsymbol{V}, \mathcal{E})\) with vertices \(\boldsymbol{V}\) and edges \(\mathcal{E}\) such that the joint distribution can be factorised as \(\operatorname{Pr}(\boldsymbol{V})=\prod_{V \in \boldsymbol{V}} \operatorname{Pr}\left(V \mid \boldsymbol{Pa}_V\right)\), where \(\boldsymbol{Pa}_V\) are the parents of \(V\) in \(\mathcal{G}\).
This much should be familiar from causal DAGs.
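To make the factorisation concrete, here is a minimal sketch in plain Python. The three binary variables, the graph and all the probabilities are made up for illustration; only the product-of-conditionals structure comes from the definition above.

```python
from itertools import product

# Hypothetical DAG over binary variables: R -> S, R -> W, S -> W.
parents = {"R": (), "S": ("R",), "W": ("R", "S")}

# cpt[V][parent_values] = Pr(V = 1 | Pa_V = parent_values); the numbers are invented.
cpt = {
    "R": {(): 0.2},
    "S": {(0,): 0.4, (1,): 0.01},
    "W": {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
}

def joint(assignment):
    """Pr(V = assignment), computed as the product of Pr(V | Pa_V) over all nodes."""
    prob = 1.0
    for v, pa in parents.items():
        p_one = cpt[v][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[v] == 1 else 1.0 - p_one
    return prob

# Sanity check: summing the joint over all assignments should give 1.
names = list(parents)
total = sum(
    joint(dict(zip(names, vals))) for vals in product((0, 1), repeat=len(names))
)
```

Each factor only looks at a node and its parents, which is the whole content of the factorisation: the graph tells us which conditional tables we need to specify.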
There’s an extension to classic Bayesian networks called influence diagrams, which generalize Bayesian networks to represent decision problems, using “square nodes for decision variables, diamond nodes for utility variables, and round nodes for everything else.” In contrast to classic influence diagrams, the Everitt et al. (2021) variant (Causal Influence Models) carries probability distributions over the decision variables.
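As a toy illustration of the three node types, the following sketch solves a single-decision influence diagram by brute force. The umbrella scenario, the probabilities, the utilities and the function names are all assumptions for the example; this is not how Everitt et al. (2021) or any particular library computes optimal policies.

```python
from itertools import product

# Hypothetical influence diagram with edges S -> D, S -> U, D -> U.
# S is a chance node (round), D a decision node (square), U a utility node (diamond).
p_s = {"rain": 0.3, "sun": 0.7}          # Pr(S); invented numbers
actions = ("umbrella", "no_umbrella")    # domain of the decision D
utility = {                              # U as a function of its parents (S, D)
    ("rain", "umbrella"): 1.0,
    ("rain", "no_umbrella"): -2.0,
    ("sun", "umbrella"): -0.5,
    ("sun", "no_umbrella"): 1.0,
}

def expected_utility(policy):
    """E[U] when D is chosen by `policy`, a dict mapping each observed S to an action."""
    return sum(p * utility[(s, policy[s])] for s, p in p_s.items())

# Enumerate every deterministic policy S -> D and keep the one with the highest E[U].
states = list(p_s)
policies = [
    dict(zip(states, choice)) for choice in product(actions, repeat=len(states))
]
best = max(policies, key=expected_utility)
```

The decision node has no conditional table of its own here; a policy supplies one, and each candidate policy turns the diagram back into an ordinary Bayesian network that we can evaluate.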
3 Mechanization
The other extension we want is mechanization: pair every object-level variable \(V\) with a mechanism variable \(M_V\) encoding how \(V\) is computed from its parents (Kenton et al. 2023). Mechanism nodes become first-class, so we can talk about edges between mechanisms — an agent reasoning about how its objective depends on the world’s mechanism, or a predictor reading an agent’s policy (MacDermott, Everitt, and Belardinelli 2023).
The combination — mechanized influence diagrams — is the formalism that lets us think about decisions where the agent’s policy is itself a node in the graph. Worked through in Decision theory in mechanized causal graphs.
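A rough sketch of the bookkeeping involved, assuming we represent a graph as a plain list of directed edges: mechanizing adds a node \(M_V\) and an edge \(M_V \to V\) for each object-level \(V\), after which edges between mechanisms can be added like any others. The `mechanize` helper and the `M_` naming scheme are hypothetical, and the single mechanism-to-mechanism edge is only an illustration of the kind of dependence described above.

```python
def mechanize(edges):
    """Return the edge list with a mechanism edge M_V -> V added for every node V."""
    nodes = sorted({n for edge in edges for n in edge})
    mechanism_edges = [(f"M_{v}", v) for v in nodes]
    return list(edges) + mechanism_edges

# Object-level influence diagram: S -> D, S -> U, D -> U.
object_edges = [("S", "D"), ("S", "U"), ("D", "U")]
mech_edges = mechanize(object_edges)

# Illustrative edge between mechanisms: the decision rule M_D depends on
# how utility is computed, M_U (an assumption made for this example).
mech_edges.append(("M_U", "M_D"))
```

Once the policy \(M_D\) is a node, "who responds to whose mechanism" becomes an ordinary question about edges in the graph.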
4 Mechanized Multi-agent DAGs
We can extend mechanized influence diagrams to include many agents deciding about each other. See causality with multiple agents.
5 Identifying agency
What even is agency? How do we recognize it in natural and artificial systems? What are the implications for control, economics, and technology?
Discovering Agents (Kenton et al. 2023; MacDermott et al. 2024) takes an empirical look at the question of agency by examining, AFAICT, what counts as a deciding node in a mechanized causal graph.
6 Causal attribution and blameworthiness
I should write more about this, since it connects to computational morality. Everitt et al. (2022) and Halpern and Kleiman-Weiner (2018) are relevant works in this domain.
7 Causal, Evidential, Updateless decision theory
Newcomb’s problem might be amenable to decomposition by causal methods — specifically mechanized causal methods, since the structural feature of Newcomb is “the predictor reads the agent’s policy”, and that’s the kind of dependence mechanism edges express. Worked through in Decision theory in mechanized causal graphs.
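For orientation, here is a minimal numerical sketch of the two expected-value calculations usually associated with evidential and causal reasoning in Newcomb’s problem. The payoffs, the predictor accuracy and the function names are conventional or illustrative assumptions, not taken from the linked post, and the sketch does not use mechanized graphs.

```python
# Conventional Newcomb payoffs (an assumption; the text above does not fix them).
BIG, SMALL = 1_000_000, 1_000

def edt_values(accuracy):
    """Expected payoff per act when the act is treated as evidence about the prediction."""
    one_box = accuracy * BIG                 # predictor probably foresaw one-boxing
    two_box = (1 - accuracy) * BIG + SMALL   # predictor probably foresaw two-boxing
    return {"one_box": one_box, "two_box": two_box}

def cdt_values(p_full):
    """Expected payoff per act when Pr(opaque box is full) is held fixed at p_full."""
    one_box = p_full * BIG
    two_box = p_full * BIG + SMALL           # same box contents, plus the small box
    return {"one_box": one_box, "two_box": two_box}

edt = edt_values(accuracy=0.99)
cdt = cdt_values(p_full=0.5)
```

With a 99% accurate predictor the evidential calculation favours one-boxing, while the causal calculation favours two-boxing by `SMALL` whatever `p_full` is. The disagreement comes entirely from whether the box contents are allowed to depend on the agent’s policy.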
8 Tooling
Fox and coauthors wrote a library for computing with various interesting causal influence diagrams, causalincentives/pycid (Fox et al. 2021):
“Library for graphical models of decision making, based on pgmpy and networkx”
