# Information geometry

October 21, 2011 — December 27, 2019

functional analysis, geometry, networks, statistics

A placeholder for a particular type of learning on curved spaces about which I do not in fact know anything.

Maybe see Azimuth’s Information Geometry series, plus the overview.

## 1 References

Amari, Shun-ichi. 1987. “Differential Geometrical Theory of Statistics.” In *Differential Geometry in Statistical Inference*.

Amari, Shun-ichi. 1998. “Natural Gradient Works Efficiently in Learning.” *Neural Computation*.

Amari, Shun-ichi. 2001. “Information Geometry on Hierarchy of Probability Distributions.” *IEEE Transactions on Information Theory*.

Amari, Shun-ichi, Karakida, and Oizumi. 2018. “Fisher Information and Natural Gradient Learning of Random Deep Networks.” *arXiv:1808.07172 [Cond-Mat, Stat]*.

Amari, Shun-ichi, Park, and Fukumizu. 2000. “Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons.” *Neural Computation*.

Ay. 2002. “An Information-Geometric Approach to a Theory of Pragmatic Structuring.” *The Annals of Probability*.

Barndorff-Nielsen. 1987. “Differential and Integral Geometry in Statistical Inference.” In *Differential Geometry in Statistical Inference*.

Brody, and Rivier. 1995. “Geometrical Aspects of Statistical Mechanics.” *Physical Review E*.

Caticha. 2015. “The Basics of Information Geometry.” In.

Csiszár. 1975. “I-Divergence Geometry of Probability Distributions and Minimization Problems.” *The Annals of Probability*.

Doyle, and Steiner. 2011. “Commuting Time Geometry of Ergodic Markov Chains.”

Fernández-Martínez, Fernández-Muñiz, Pallero, et al. 2013. “From Bayes to Tarantola: New Insights to Understand Uncertainty in Inverse Problems.” *Journal of Applied Geophysics*.

Kass, Amari, Arai, et al. 2018. “Computational Neuroscience: Mathematical and Statistical Perspectives.” *Annual Review of Statistics and Its Application*.

Khan, and Zhang. 2022. “When Optimal Transport Meets Information Geometry.” *Information Geometry*.

Kulhavý. 1990. “Recursive Nonlinear Estimation: A Geometric Approach.” *Automatica*.

Lauritzen. 1987. “Statistical Manifolds.” In *Differential Geometry in Statistical Inference*.

Ly, Marsman, Verhagen, et al. 2017. “A Tutorial on Fisher Information.” *Journal of Mathematical Psychology*.

Mallasto, Gerolin, and Minh. 2021. “Entropy-Regularized 2-Wasserstein Distance Between Gaussian Measures.” *Information Geometry*.

Martens. 2020. “New Insights and Perspectives on the Natural Gradient Method.” *Journal of Machine Learning Research*.

Miolane, Mathe, Donnat, et al. 2018. “Geomstats: A Python Package for Riemannian Geometry in Machine Learning.” *arXiv:1805.08308 [Cs, Stat]*.

Mosegaard, and Tarantola. 1995. “Monte Carlo Sampling of Solutions to Inverse Problems.” *Journal of Geophysical Research: Solid Earth*.

———. 2002. “Probabilistic Approach to Inverse Problems.” In *International Geophysics*.

Nielsen. 2018. “An Elementary Introduction to Information Geometry.” *arXiv:1808.08271 [Cs, Math, Stat]*.

Palomar, and Verdu. 2008. “Lautum Information.” *IEEE Transactions on Information Theory*.

Poole, Lahiri, Raghu, et al. 2016. “Exponential Expressivity in Deep Neural Networks Through Transient Chaos.” In *Advances in Neural Information Processing Systems 29*.

Raginsky, and Sason. 2012. “Concentration of Measure Inequalities in Information Theory, Communications and Coding.” *Foundations and Trends in Communications and Information Theory*.

Transtrum, Machta, and Sethna. 2011. “The Geometry of Nonlinear Least Squares with Applications to Sloppy Models and Optimization.” *Physical Review E*.