Singular Learning Theory

October 29, 2024 — October 30, 2024

dynamical systems
machine learning
neural nets
physics
pseudorandomness
sciml
statistics
statmech
stochastic processes

Placeholder.

Jesse Hoogland, “Neural networks generalize because of this one weird trick”:

Statistical learning theory is lying to you: “overparametrized” models actually aren’t overparametrized, and generalization is not just a question of broad basins.
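The “one weird trick” is Watanabe’s asymptotic expansion of the Bayes free energy, in which a birational invariant, the real log canonical threshold (RLCT), replaces the raw parameter count. A minimal statement of the result, following Watanabe (2009):

```latex
% Bayes free energy asymptotics (Watanabe 2009).
% L_n: empirical negative log-likelihood, w_0: a true parameter,
% \varphi: prior, \lambda: the real log canonical threshold (RLCT).
F_n \;=\; -\log \int \exp\bigl(-n L_n(w)\bigr)\,\varphi(w)\,\mathrm{d}w
    \;=\; n L_n(w_0) \;+\; \lambda \log n \;+\; O_p(\log\log n).
```

For a regular model $\lambda = d/2$ and this recovers the BIC penalty; for singular models, neural networks included (Wei, Murfet, Gong, et al. 2023), $\lambda$ can be much smaller than $d/2$. That is the precise sense in which an “overparametrized” network is effectively not overparametrized.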

1 Local Learning Coefficient

Recommended to me by Rohan Hitchcock: Lau et al. (2024), which defines the local learning coefficient (LLC), a singularity-aware measure of model complexity evaluated at a particular trained parameter $w^*$ rather than globally.
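Lau et al. estimate the LLC at $w^*$ by sampling a tempered, localized posterior with SGLD and computing $\hat\lambda(w^*) = n\beta\,(\mathbb{E}[L_n(w)] - L_n(w^*))$ with inverse temperature $\beta = 1/\log n$. Below is a minimal self-contained sketch of that estimator on a toy singular model, the two-parameter product regression $y = w_1 w_2 x$, whose optimum set $\{w_1 w_2 = 0\}$ is singular at the origin. The hyperparameters (step size `eps`, localization strength `gamma`, chain length) are illustrative guesses, not the paper’s settings, and the chain uses full-batch gradients rather than minibatch SGLD for simplicity.

```python
import math
import torch

torch.manual_seed(0)

# Toy singular model: f(x; w) = w1 * w2 * x, true function f = 0.
# The product parametrization makes {w1 * w2 = 0} a singular optimum set.
n = 1000
sigma = 0.1
x = torch.randn(n)
y = sigma * torch.randn(n)  # data from the true (zero) function plus noise

def loss(w):
    # Average negative log-likelihood, up to an additive constant,
    # for Gaussian noise with known standard deviation sigma.
    return ((w[0] * w[1] * x - y) ** 2).mean() / (2 * sigma**2)

def estimate_llc(w_star, eps=1e-4, gamma=100.0, steps=5000, burn_in=1000):
    """Estimate the LLC at w_star with full-batch Langevin dynamics.

    Samples the localized tempered posterior
        p(w) prop. to exp(-n * beta * L_n(w) - (gamma / 2) * |w - w_star|^2)
    and returns lhat = n * beta * (mean posterior loss - L_n(w_star)).
    """
    beta = 1.0 / math.log(n)
    w = w_star.clone().requires_grad_(True)
    loss_star = loss(w_star).item()
    post_losses = []
    for t in range(steps):
        potential = n * beta * loss(w) + 0.5 * gamma * ((w - w_star) ** 2).sum()
        (grad,) = torch.autograd.grad(potential, w)
        with torch.no_grad():
            # Langevin step: drift down the potential plus Gaussian noise.
            w += -0.5 * eps * grad + math.sqrt(eps) * torch.randn_like(w)
        if t >= burn_in:
            post_losses.append(loss(w).item())
    mean_loss = sum(post_losses) / len(post_losses)
    return n * beta * (mean_loss - loss_star)

w_star = torch.tensor([0.0, 0.0])  # a point on the singular optimum set
print(f"estimated LLC at the origin: {estimate_llc(w_star):.3f}")
```

For a regular two-parameter model the baseline would be $d/2 = 1$; at the singular origin the estimate should come out noticeably lower, reflecting the degeneracy of the $w_1 w_2 = 0$ set.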

2 Incoming

3 References

Carroll. 2021. “Phase Transitions in Neural Networks.”
Farrugia-Roberts, Murfet, and Geard. 2022. “Structural Degeneracy in Neural Networks.”
Lau, Furman, Wang, et al. 2024. “The Local Learning Coefficient: A Singularity-Aware Complexity Measure.”
Lin. 2011. “Algebraic Methods for Evaluating Integrals in Bayesian Statistics.”
Wang, Hoogland, Wingerden, et al. 2024. “Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient.”
Watanabe. 2009. Algebraic Geometry and Statistical Learning Theory. Cambridge Monographs on Applied and Computational Mathematics.
———. 2020. Mathematical Theory of Bayesian Statistics.
Wei, and Lau. 2023. “Variational Bayesian Neural Networks via Resolution of Singularities.” Journal of Computational and Graphical Statistics.
Wei, Murfet, Gong, et al. 2023. “Deep Learning Is Singular, and That’s Good.” IEEE Transactions on Neural Networks and Learning Systems.