Information geometry

A placeholder for a particular type of learning on curved spaces, namely manifolds of probability distributions, about which I do not in fact know anything.

Maybe see Azimuth’s Information Geometry series, plus the overview.

Amari, Shun-ichi. 1998. “Natural Gradient Works Efficiently in Learning.” Neural Computation 10 (2): 251–76. https://doi.org/10.1162/089976698300017746.

Amari, Shun-ichi, Ryo Karakida, and Masafumi Oizumi. 2018. “Fisher Information and Natural Gradient Learning of Random Deep Networks,” August. http://arxiv.org/abs/1808.07172.

Amari, Shun-ichi, Hyeyoung Park, and Kenji Fukumizu. 2000. “Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons.” Neural Computation 12 (6): 1399–1409. https://doi.org/10.1162/089976600300015420.

Amari, Shun-ichi. 1987. “Differential Geometrical Theory of Statistics.” In Differential Geometry in Statistical Inference, 19–94.

———. 2001. “Information Geometry on Hierarchy of Probability Distributions.” IEEE Transactions on Information Theory 47: 1701–11. https://doi.org/10.1109/18.930911.

Ay, Nihat. 2002. “An Information-Geometric Approach to a Theory of Pragmatic Structuring.” The Annals of Probability 30 (1): 416–36. https://doi.org/10.1214/aop/1020107773.

Barndorff-Nielsen, O. E. 1987. “Differential and Integral Geometry in Statistical Inference.” In Differential Geometry in Statistical Inference. Sn Aarhus.

Brody, Dorje, and Nicolas Rivier. 1995. “Geometrical Aspects of Statistical Mechanics.” Physical Review E 51 (2): 1006–11. https://doi.org/10.1103/PhysRevE.51.1006.

Csiszár, I. 1975. “I-Divergence Geometry of Probability Distributions and Minimization Problems.” The Annals of Probability 3 (1): 146–58.

Doyle, Peter G., Gregory Leibon, and Jean Steiner. n.d. “Conformal Geometry of Markov Chains.”

Fernández-Martínez, J. L., Z. Fernández-Muñiz, J. L. G. Pallero, and L. M. Pedruelo-González. 2013. “From Bayes to Tarantola: New Insights to Understand Uncertainty in Inverse Problems.” Journal of Applied Geophysics 98 (November): 62–72. https://doi.org/10.1016/j.jappgeo.2013.07.005.

Kass, Robert E., Shun-ichi Amari, Kensuke Arai, Emery N. Brown, Casey O. Diekman, Markus Diesmann, Brent Doiron, et al. 2018. “Computational Neuroscience: Mathematical and Statistical Perspectives.” Annual Review of Statistics and Its Application 5 (1): 183–214. https://doi.org/10.1146/annurev-statistics-041715-033733.

Lauritzen, S. L. 1987. “Statistical Manifolds.” In Differential Geometry in Statistical Inference, 164. JSTOR.

Martens, James. 2014. “New Insights and Perspectives on the Natural Gradient Method,” December. http://arxiv.org/abs/1412.1193.

Miolane, Nina, Johan Mathe, Claire Donnat, Mikael Jorda, and Xavier Pennec. 2018. “Geomstats: A Python Package for Riemannian Geometry in Machine Learning,” May. http://arxiv.org/abs/1805.08308.

Mosegaard, Klaus, and Albert Tarantola. 1995. “Monte Carlo Sampling of Solutions to Inverse Problems.” Journal of Geophysical Research 100 (B7): 12431. http://www.gfy.ku.dk/~klaus/ip/MT-1995.pdf.

Nielsen, Frank. 2018. “An Elementary Introduction to Information Geometry,” August. http://arxiv.org/abs/1808.08271.

Palomar, D. P., and S. Verdu. 2008. “Lautum Information.” IEEE Transactions on Information Theory 54 (3): 964–75. https://doi.org/10.1109/TIT.2007.915715.

Poole, Ben, Subhaneil Lahiri, Maithreyi Raghu, Jascha Sohl-Dickstein, and Surya Ganguli. 2016. “Exponential Expressivity in Deep Neural Networks Through Transient Chaos.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3360–68. Curran Associates, Inc. http://papers.nips.cc/paper/6322-exponential-expressivity-in-deep-neural-networks-through-transient-chaos.pdf.

Raginsky, Maxim, and Igal Sason. 2012. “Concentration of Measure Inequalities in Information Theory, Communications and Coding.” Foundations and Trends in Communications and Information Theory, December. http://arxiv.org/abs/1212.4663.