Singular Learning Theory
October 29, 2024 — March 23, 2025
Placeholder.
As far as I can tell, a first-order approximation to (the bits I vaguely understand of) Singular Learning Theory is something like:
Classical Bayesian statistics has a good theory of well-posed (regular) models with a small number of interpretable parameters. Singular Learning Theory extends this to ill-posed (singular) models with a large number of uninterpretable parameters, recovering a workable Bayesian asymptotic theory by importing results from algebraic geometry about singularities in the loss surface.
There are obviously a lot of details missing from that. I think there are non-Bayesian versions too, but I haven’t been exposed to them yet.
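If I had to compress the headline result into one formula, it would be (I believe) Watanabe's free-energy expansion. For a model with empirical loss $L_n$ minimised at $w_0$, the Bayes free energy behaves asymptotically as

$$
F_n = n L_n(w_0) + \lambda \log n - (m - 1) \log \log n + O_p(1),
$$

where $\lambda$ is the real log canonical threshold (RLCT, a.k.a. the learning coefficient) and $m$ its multiplicity. For a regular model $\lambda = d/2$, with $d$ the parameter count, which recovers the BIC penalty; in singular models $\lambda$ can be much smaller than $d/2$, which is the sense in which "overparametrized" models are not as complex as their raw parameter count suggests.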
Jesse Hoogland, Neural networks generalize because of this one weird trick:
Statistical learning theory is lying to you: “overparametrized” models actually aren’t overparametrized, and generalization is not just a question of broad basins.
1 Local Learning Coefficient
Recommended to me by Rohan Hitchcock:
- You’re Measuring Model Complexity Wrong explains Lau et al. (2024)’s *local learning coefficient*.
- The RLCT Measures the Effective Dimension of Neural Networks
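As I understand the estimator in those links, the local learning coefficient at a trained parameter $w^*$ is estimated as $\hat\lambda = n\beta\,(\mathbb{E}^\beta_w[L_n(w)] - L_n(w^*))$ with $\beta \approx 1/\log n$, where the expectation is over an SGLD chain tethered to $w^*$ by a quadratic localising term. Here is a minimal sketch on a toy singular model ($f(w, x) = w_1 w_2 x$, whose zero-loss set $\{w_1 w_2 = 0\}$ has a singularity at the origin); all names, step sizes, and the localising strength `gamma` are illustrative choices, not anything prescribed by the papers.

```python
import math
import random

random.seed(0)

# Toy data: y is pure noise, model f(w, x) = w1 * w2 * x.
# The minimum set {w1 * w2 = 0} is singular at the origin.
n = 500
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [random.gauss(0, 0.1) for _ in range(n)]

def loss(w1, w2):
    """Empirical squared-error loss L_n(w)."""
    return sum((w1 * w2 * x - y) ** 2 for x, y in zip(xs, ys)) / n

def grad(w1, w2):
    """Gradient of L_n(w)."""
    g1 = g2 = 0.0
    for x, y in zip(xs, ys):
        r = w1 * w2 * x - y
        g1 += 2 * r * w2 * x
        g2 += 2 * r * w1 * x
    return g1 / n, g2 / n

def estimate_llc(w1s, w2s, steps=4000, eps=1e-4, gamma=100.0):
    """SGLD chain localised at (w1s, w2s); returns hat_lambda."""
    beta = 1.0 / math.log(n)   # inverse temperature ~ 1 / log n
    w1, w2 = w1s, w2s
    total = 0.0
    for _ in range(steps):
        g1, g2 = grad(w1, w2)
        # Langevin step on the tempered, localised posterior:
        # drift = -(eps/2) * (n*beta*grad L_n + gamma*(w - w*)), plus N(0, eps) noise
        w1 += -0.5 * eps * (n * beta * g1 + gamma * (w1 - w1s)) \
              + math.sqrt(eps) * random.gauss(0, 1)
        w2 += -0.5 * eps * (n * beta * g2 + gamma * (w2 - w2s)) \
              + math.sqrt(eps) * random.gauss(0, 1)
        total += loss(w1, w2)
    # hat_lambda = n * beta * (mean loss along the chain - loss at w*)
    return n * beta * (total / steps - loss(w1s, w2s))

lam = estimate_llc(0.0, 0.0)
print(f"estimated local learning coefficient at the origin: {lam:.3f}")
```

The estimate is noisy and sensitive to `eps`, `gamma`, and chain length; the point of the sketch is only the shape of the estimator, not a usable measurement.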
2 Incoming
See also Watanabe (2022).
Alexander Gietelink Oldenziel, Singular Learning Theory
metauni’s Singular Learning Theory seminar
Timaeus is an AI safety research organisation working on applications of Singular Learning Theory (SLT) to alignment.