ILIAD2
Odyssey
2025-02-22 — 2025-09-04
Wherein the Bay Area unconference is recorded, a neural‑network analogue of the computational no‑coincidence conjecture is outlined, and a phase transition in singular learning theory is noted.
ILIAD2 is an unconference about diverse mainstream and left-field approaches to technical AI Safety in the SF Bay Area.
Much happened when I attended in 2025, and I haven’t digested it all yet. What follows is my highlight list. It’s somewhat scattered.
1 Singular learning theory
The breakaway theme of the conference was the Singular Learning Theory work, which went through a phase transition: what started, in my opinion, as a set of suggestive results that suggestively resembled an approach to AI safety became results that actually look like they might be useful for non‑trivial things. Colour me surprised. There’s too much to summarize here; perhaps someone deeper into it will have a go.
2 Textbook from the future
I mention it both because it’s a cool new project and because it aspires to serve as an introduction to the whole AI Safety field.
The metaphor I usually use is that if a textbook from one hundred years in the future fell into our hands, containing all of the simple ideas that actually work robustly in practice, we could probably build an aligned super‑intelligence in six months. — Eliezer Yudkowsky
They’re attempting to write it. See here.

3 Social Choice Theory: alignment as a Maximal Lottery

Roberto-Rafael Maura-Rivero explained next-token choice as a collective social choice problem. Sign me up!

He introduced Maximal Lotteries, the only stochastic social choice rule with certain nice properties that are desirable in an AI context (Brandl and Brandt 2020) (read the paper for which; I forget).

Things I learned:

RLHF is a “Borda count” vote (!) (Siththaranjan, Laidlaw, and Hadfield-Menell 2023), which performs poorly as a voting mechanism with respect to Condorcet outcomes.

Cf. “Nash learning from human feedback” (NashLHF), the “best” (democratically speaking) feedback system (Munos et al. 2024).

More info at Lanctot et al. (2025), Maura-Rivero, Lanctot, et al. (2025), Maura-Rivero, Nagpal, et al. (2025).
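To make the Borda/Condorcet contrast concrete, here is a quick sketch on a toy preference profile of my own devising (nothing from the talk): it computes Borda scores, the Condorcet winner, and a maximal lottery, the last as a maximin strategy of the symmetric zero-sum game whose payoff matrix is the pairwise majority margins, solved by linear programming.

```python
# Toy illustration (my own example, not from the talk): Borda vs Condorcet
# winners, and the maximal lottery as the maximin strategy of the zero-sum
# "majority margin" game. Requires numpy and scipy.
import numpy as np
from scipy.optimize import linprog

candidates = ["a", "b", "c"]
# Preference profile: (number of voters, ranking best-to-worst)
profile = [(3, ["a", "b", "c"]), (2, ["b", "c", "a"])]

n = len(candidates)
idx = {c: i for i, c in enumerate(candidates)}

# Pairwise majority margins M[i, j] = #(i over j) - #(j over i)
M = np.zeros((n, n))
for count, ranking in profile:
    for hi in range(n):
        for lo in range(hi + 1, n):
            i, j = idx[ranking[hi]], idx[ranking[lo]]
            M[i, j] += count
            M[j, i] -= count

# Borda scores: a candidate gets (n - 1 - rank) points per voter
borda = np.zeros(n)
for count, ranking in profile:
    for rank, c in enumerate(ranking):
        borda[idx[c]] += count * (n - 1 - rank)

# Condorcet winner: beats every other candidate head-to-head (may not exist)
condorcet = [c for c in candidates
             if all(M[idx[c], j] > 0 for j in range(n) if j != idx[c])]

# Maximal lottery: p maximising v subject to (p @ M)_j >= v for every j,
# i.e. a maximin strategy of the symmetric zero-sum game with payoffs M.
# LP variables x = (p_1, ..., p_n, v); minimise -v.
c_obj = np.concatenate([np.zeros(n), [-1.0]])
A_ub = np.hstack([-M.T, np.ones((n, 1))])          # -M^T p + v <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * n + [(None, None)]
res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
lottery = res.x[:n]

print("Borda scores:    ", dict(zip(candidates, borda)))             # b wins Borda
print("Condorcet winner:", condorcet)                                # a
print("Maximal lottery: ", dict(zip(candidates, lottery.round(3))))  # all mass on a
```

On this profile the Borda winner (b) differs from the Condorcet winner (a), while the maximal lottery puts all its probability on the Condorcet winner; maximal lotteries always do so when a Condorcet winner exists, which is exactly the property RLHF-as-Borda lacks.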
4 Rosas and Boyd, AI in a vat via bisimulation
AI in a vat: Fundamental limits of efficient world modelling for safe agent sandboxing (Rosas, Boyd, and Baltieri 2025). Connecting world models, simulation hypotheses, etc.
5 Oli Richardson’s probabilistic dependency graphs
Cool research on a generalized graphical model family. The author gave it a good pitch:
This thesis develops a broad theory of how to approach probabilistic modeling with possibly-inconsistent information, unifying and reframing much of the literature in the process. The key ingredient is a novel kind of graphical model, called a Probabilistic Dependency Graph (PDG), which allows for arbitrary (even conflicting) pieces of probabilistic information. In Part I, we establish PDGs as a generalization of other models of mental state, including traditional graphical models such as Bayesian Networks and Factor Graphs, as well as causal models, and even generalizations of probability distributions, such as Dempster-Shafer Belief functions. In Part II, we show that PDGs also capture modern neural representations. Surprisingly, standard loss functions can be viewed as the inconsistency of a PDG that models the situation appropriately. Furthermore, many important algorithms in AI are instances of a simple approach to resolving inconsistencies. In Part III, we provide algorithms for PDG inference, and uncover a deep algorithmic equivalence between the problems of inference and calculating a PDG’s numerical degree of inconsistency.

This stuff was an extremely interesting way of approaching inconsistency as a kind of generalized inferential target, producing some classic losses and model structures as special cases to persuade us that it’s natural. Very tasty work.
He described it compactly as constraining an inference algorithm to be consistent (in some sense) with respect to beliefs rather than utilities, which is a desirable property.
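A toy rendering of the “losses as inconsistency” idea, heavily simplified by me (unit confidence weights, none of the thesis’s entropy-regularisation machinery): give a PDG two conflicting unconditional beliefs about a single binary variable and score the inconsistency as the best achievable sum of KL divergences from one reconciling distribution to each belief.

```python
# Toy sketch (my simplification of the PDG idea, not Richardson's full
# definition): two conflicting beliefs p and q about one binary variable X.
# The "inconsistency" here is min over distributions mu of KL(mu||p) + KL(mu||q),
# which is attained at the normalised geometric mean of p and q.
import numpy as np

def kl(mu, p):
    return float(np.sum(mu * np.log(mu / p)))

p = np.array([0.1, 0.9])   # source 1: P(X=1) = 0.9
q = np.array([0.8, 0.2])   # source 2: P(X=1) = 0.2

# Brute-force the reconciling distribution mu over a grid of P_mu(X=1)
grid = np.linspace(1e-6, 1 - 1e-6, 10001)
scores = [kl(np.array([1 - m, m]), p) + kl(np.array([1 - m, m]), q) for m in grid]
best = int(np.argmin(scores))
print("numerical inconsistency:", scores[best], "at P_mu(X=1) =", grid[best])

# Closed form: the minimiser is proportional to sqrt(p*q), and the minimum
# value is -2 * log(sum_x sqrt(p(x) q(x))), twice the Bhattacharyya distance.
geo = np.sqrt(p * q)
print("closed-form inconsistency:", -2 * np.log(geo.sum()), "at mu =", geo / geo.sum())
```

The reconciling distribution comes out as the normalised geometric mean of the two beliefs, and the inconsistency is twice the Bhattacharyya distance between them; the particular p, q and the grid search are purely illustrative choices.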
6 Daniel Herrmann and Aydin Mohseni on whether causal inference is even a thing
TBD; I lost my notes. But they argued that something like causal inference — in the sense of inference about interventions — does not “require” the do operator: it can instead be constructed as standard conditionalization in an expanded graph, except that in practice it ends up being computationally cheaper to use the do operator.
It feels like something very deep is going on here.
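Here is a minimal numerical sketch of the construction as I understood it (my own toy CPTs, binary variables throughout): on a confounded chain Z → X → Y with Z → Y, the interventional distribution P(Y | do(X = x)) from the truncated factorisation coincides with ordinary conditioning on a “regime” indicator F_X in an augmented graph whose mechanism for X defers to F_X.

```python
# Toy check (my own numbers): intervention by truncated factorisation vs
# ordinary conditioning on a regime/intervention node in an augmented graph.
# All variables are binary; CPTs are indexed by parent values.
import numpy as np

P_Z = np.array([0.7, 0.3])                            # P(Z)
P_X_given_Z = np.array([[0.8, 0.2],                   # P(X | Z=0)
                        [0.3, 0.7]])                  # P(X | Z=1)
P_Y_given_XZ = np.array([[[0.9, 0.1], [0.6, 0.4]],    # P(Y | X=0, Z=0/1)
                         [[0.2, 0.8], [0.5, 0.5]]])   # P(Y | X=1, Z=0/1)

x_star = 1  # the intervention do(X = 1)

# 1) Truncated factorisation: P(y | do(x*)) = sum_z P(z) P(y | x*, z)
p_do = np.array([sum(P_Z[z] * P_Y_given_XZ[x_star, z, y] for z in range(2))
                 for y in range(2)])

# 2) Augmented graph: add a root F_X in {idle, set0, set1}. Under F_X = idle,
#    X follows its usual mechanism; under F_X = set_x, X is clamped to x.
#    Then P(y | F_X = set_{x*}) is plain conditioning, with no do operator.
P_F = np.array([0.8, 0.1, 0.1])    # arbitrary prior over regimes; it cancels

def p_x_given_z_f(x, z, f):
    if f == 0:                                   # idle: the natural mechanism
        return P_X_given_Z[z, x]
    return 1.0 if x == f - 1 else 0.0            # set0 / set1: clamp X

f_star = x_star + 1                              # the regime "set X = x*"
joint_yf = np.zeros(2)                           # weight of (Y=y, F=f*), summed over z, x
norm = 0.0
for z in range(2):
    for x in range(2):
        w = P_F[f_star] * P_Z[z] * p_x_given_z_f(x, z, f_star)
        norm += w
        for y in range(2):
            joint_yf[y] += w * P_Y_given_XZ[x, z, y]
p_cond = joint_yf / norm

print("P(Y | do(X=1))   truncated factorisation:  ", p_do)
print("P(Y | F_X=set1)  augmented-graph condition:", p_cond)
assert np.allclose(p_do, p_cond)
```

The augmented-graph route carries the extra regime variable through every factor, which gestures at why the do operator ends up cheaper in practice.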
7 Daniel Herrmann on principal-agent problems
When is it rational to outsource to an agent, e.g. an AI agent? It’s an interesting slice of the alignment pie.
8 Adam Goldstein from Softmax on enlightened machines
I don’t know what to make of this yet. He made an argument from human developmental psychology that training bots on the entirety of the internet implicitly trains little psychopaths, which we can only understand as objects of control because we cannot imagine them as co-subjects. Sounds bad? But I only saw the initial presentation, which wasn’t very quantitative or detailed, so I can’t speak to it. There were some follow-on presentations that may have filled in the necessary details.
9 Julian Gough on cosmic natural selection
Anthropic principles via natural selection upon black holes!
10 Linkdump
Things I learned or people I met but didn’t have time to make better notes about.
- Greatest game of the conference: Person Do Thing, introduced to me by Daniel Herrmann and Aydin Mohseni.
- Artemy Kolchinsky did some amazing tricks with the information bottleneck.
- Post-AGI Civilizational Equilibria Workshop | Vancouver 2025
- Anarchy as Architect: Competitive Pressure, Technology, and the Internal Structure of States
- The PIBBSS summer research fellowship:

  The PIBBSS summer research fellowship is designed for researchers from various fields, mostly studying complex and intelligent behavior in natural and social systems, but also those studying mathematics, philosophy or engineering, who are motivated by the mission of making AI systems safe and beneficial.

  During the program, fellows work on selected projects at the intersection between their field of expertise and AI safety. Fellows will work in close collaboration with a mentor who will help them effectively navigate the AI risk landscape and apply their knowledge to it.

  The program is centrally aimed at Ph.D. or postdoctoral researchers; however, we encourage interested individuals with comparable research experience in their field of expertise to apply regardless of their credentials.
- What are the new technologies and challenges that AI could unlock? Which will come first?
- What can companies and governments do to avoid extreme power concentrations?
- Which beneficial applications of AI should be accelerated? How can we do that?
- How do we reach really good futures (rather than “just” avoiding catastrophe)?
11 Official soundtrack
The official soundtrack of ILIAD2 was General Fuzz. Shout out to Headphone James.
12 Proceedings reviews
I reviewed exactly one paper; it was about computational no-coincidence conjectures.
