Decision theory in mechanized causal graphs
Newcomb’s box via the back door
2026-05-05 — 2026-05-06
In which mechanism nodes are rendered first-class in causal graphs, permitting edges between mechanisms, and Newcomb’s problem is illuminated by routing a predictor’s dependence through the agent’s policy.
Causality, agency, decisions, learning is the parent of this notebook; there we hand-waved through mechanized causal graphs.
Here we go deep and explore the formalism introduced in Everitt et al. (2021) and Everitt et al. (2022). The point of the notation is that mechanism nodes are first-class, so we can talk about edges between mechanisms — e.g. an agent reasoning about how its objective depends on the world’s mechanism, or a predictor reading the agent’s policy (MacDermott, Everitt, and Belardinelli 2023).
By the end of the notebook we should be able to use MacDermott, Everitt, and Belardinelli (2023) to illuminate cosmic decision theory. Much as I think standard DAGs have a role in uncovering hidden assumptions in ML, I suspect mechanized causal graphs will help us uncover hidden assumptions in cosmic decision theories.
But let us see.
Later, we may even be able to use Kenton et al. (2023) to discover agents.
1 Mechanizing causal graphs
A mechanized causal graph augments a causal DAG with a mechanism variable for each object-level node, encoding how that node’s value is generated from its parents (Kenton et al. 2023); the mechanism of a decision node is the agent’s policy.
Layered on top is the influence-diagram convention from Everitt et al. (2021): rectangles for decisions, diamonds for utilities, ovals/circles for everything else.
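Concretely, my paraphrase of the construction in Kenton et al. (2023) (so the notation here is an assumption, not the papers’ own): each object-level variable \(V\) with object-level parents \(\operatorname{pa}(V)\) acquires a mechanism parent \(M_V\), and the structural equation becomes

\[
V = M_V\bigl(\operatorname{pa}(V)\bigr), \qquad M_V \in \mathcal{F}_V,
\]

where \(\mathcal{F}_V\) is the space of functions (or conditional distributions) the mechanism may range over. For a decision node \(D\), the mechanism \(M_D\) ranges over policies, which is what makes edges into and out of \(M_D\) interesting.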
What are we doing when we claim that some nodes are special “mechanism” nodes? AFAICT formalising the intentional stance.
A LaTeX tutorial (actually a TikZ tutorial): the node and edge styles used throughout.
| Symbol | Style | Meaning |
|---|---|---|
| ◯ | obj | object-level chance variable |
| ● | mech | mechanism variable |
| ▭ | dec | decision |
| ◇ | util | utility |
| → (solid) | causal | causal edge between object-level variables, or functional edge from a mechanism to its variable |
| → (dashed) | mech-edge | non-terminal mechanism dependency |
| → (dash-dotted) | objective | terminal mechanism edge (an agent’s objective) |
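A minimal TikZ preamble implementing this legend might look like the following. The style names match the table; the particular shapes, arrow tips, and the light fill on mechanism nodes (so that labels like \(M_X\) stay readable) are my assumptions, not the papers’ exact styling.

```latex
\documentclass[tikz]{standalone}
\usetikzlibrary{shapes.geometric, arrows.meta, positioning}
\tikzset{
  obj/.style       = {draw, circle, minimum size=8mm},                % object-level chance variable
  mech/.style      = {draw, circle, fill=black!15, minimum size=8mm}, % mechanism variable
  dec/.style       = {draw, rectangle, minimum size=8mm},             % decision
  util/.style      = {draw, diamond, minimum size=10mm},              % utility
  causal/.style    = {-{Stealth}},                                    % solid: causal or functional edge
  mech-edge/.style = {-{Stealth}, dashed},                            % non-terminal mechanism dependency
  objective/.style = {-{Stealth}, dash dot},                          % terminal mechanism edge
}
\begin{document}
\begin{tikzpicture}[node distance=12mm]
  % one node of each kind, plus the two kinds of mechanism-level edge
  \node[obj]  (X)  {$X$};
  \node[dec]  (D)  [right=of X] {$D$};
  \node[util] (U)  [right=of D] {$U$};
  \node[mech] (MX) [above=of X] {$M_X$};
  \node[mech] (MD) [above=of D] {$M_D$};
  \draw[causal]    (X)  -- (D);
  \draw[causal]    (D)  -- (U);
  \draw[causal]    (MX) -- (X);   % functional: M_X computes X
  \draw[causal]    (MD) -- (D);   % functional: M_D computes D
  \draw[mech-edge] (MX) -- (MD);  % a dependency between mechanisms
\end{tikzpicture}
\end{document}
```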
2 Object-level DAG
The starting point — a vanilla causal DAG with no mechanism layer.
Each object variable acquires a mechanism node above it. The edge from \(M_X\) to \(X\) is functional, not causal: it says “\(X\)’s value is computed by the mechanism \(M_X\) from \(X\)’s parents.” The solid edges between object-level nodes are the original causal structure.
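For concreteness, here is the smallest interesting case under the styles defined above: a chain \(X \to Y\) with its mechanism layer. This body is meant to drop into the standalone document from the previous block; the layout is mine.

```latex
\begin{tikzpicture}[node distance=12mm]
  % object level: the original causal structure
  \node[obj] (X) {$X$};
  \node[obj] (Y) [right=of X] {$Y$};
  \draw[causal] (X) -- (Y);
  % mechanism layer: M_X and M_Y say how X and Y are computed
  \node[mech] (MX) [above=of X] {$M_X$};
  \node[mech] (MY) [above=of Y] {$M_Y$};
  \draw[causal] (MX) -- (X);
  \draw[causal] (MY) -- (Y);
\end{tikzpicture}
```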
3 Cosmic decision theories
The mechanized view earns its keep on problems where the agent’s policy is itself an input to some other variable. In Newcomb’s problem the predictor \(P\) forecasts the agent’s decision \(D\), but \(P\) is causally upstream of \(D\), so the dependence cannot be captured by an object-level edge from \(D\) to \(P\) without making the graph cyclic. The mechanized graph routes it through \(M_D\), the agent’s decision mechanism (its policy): \(P\) depends on \(M_D\), and \(M_D\) functionally determines \(D\). The dash-dotted edge is the literature’s objective/terminal-mechanism edge style, used loosely here to flag the “back door” through which the predictor accesses the agent.
The intended reading: prediction \(P\) determines box contents \(V\); the agent’s mechanism \(M_D\) implements a policy that produces a decision \(D\); utility \(U\) depends on both the contents and the choice; and the “Newcomb-ness” lives in the dash-dotted edge \(M_D \to P\) — the predictor sees the policy, not the act.
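A sketch of that graph in the same TikZ styles, again assuming the preamble from the legend block; the layout, and the choice to draw only the decision’s mechanism \(M_D\) while eliding the others, are mine.

```latex
\begin{tikzpicture}[node distance=14mm]
  % object level
  \node[obj]  (P) {$P$};
  \node[dec]  (D) [right=of P] {$D$};
  \node[obj]  (V) [below=of P] {$V$};
  \node[util] (U) [below=of D] {$U$};
  \draw[causal] (P) -- (V);  % prediction fixes the box contents
  \draw[causal] (V) -- (U);  % contents feed the payoff
  \draw[causal] (D) -- (U);  % so does the choice
  % mechanism layer (only the decision's mechanism is drawn)
  \node[mech] (MD) [above=of D] {$M_D$};
  \draw[causal]    (MD) -- (D); % the policy produces the act
  \draw[objective] (MD) -- (P); % the back door: P reads the policy, not the act
\end{tikzpicture}
```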
4 See also
- Causality, agency, decisions, learning — the narrative version of all this.
- Causality with feedback — thermostats and friends.
- Newcomb-style decision problems — what to do when the world reads your policy.