Learning on manifolds

Finding the lowest bit of a krazy straw, from the inside

A placeholder for learning on curved spaces. Not discussed: learning OF curved spaces.

The term also seems to cover learning where the manifold is known a priori. See, e.g., the work of Nina Miolane and collaborators on the Geomstats project (Miolane et al. 2018).
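To make "learning on a known manifold" slightly more concrete, here is a minimal sketch of an intrinsic (Fréchet) mean on the simplest curved space I can think of, the unit circle. It does not use the Geomstats API; the function names `wrap_angle` and `frechet_mean_circle` are my own hypothetical helpers, and the fixed-point iteration is only expected to behave well for data concentrated on less than half the circle.

```python
import math

def wrap_angle(a):
    """Map an angle difference to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi

def frechet_mean_circle(angles, n_iter=100, tol=1e-12):
    """Intrinsic mean of angles (radians) on the unit circle.

    Repeatedly moves the estimate by the average of the wrapped
    differences to the data, i.e. the average of the log maps.
    Assumes the data are reasonably concentrated.
    """
    theta = angles[0]
    for _ in range(n_iter):
        step = sum(wrap_angle(a - theta) for a in angles) / len(angles)
        theta += step
        if abs(step) < tol:
            break
    return theta % (2.0 * math.pi)

# Two headings either side of north: 350 degrees and 10 degrees.
angles = [math.radians(350.0), math.radians(10.0)]
naive_mean = sum(angles) / len(angles)        # ignores the geometry
intrinsic_mean = frechet_mean_circle(angles)  # respects the geometry
```

The naive arithmetic mean of the two angles points due south, while the intrinsic mean points north (up to floating-point wrap-around at 2π), which is the sort of discrepancy that motivates doing statistics on the manifold itself.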

The below headings may one day be filled in.

See also information criteria.

Information Geometry

The unholy offspring of Fisher information and differential geometry, about which I know little except that it sounds like it should be intuitive. It is probably synonymous with some of the other items on this page if I could sort out all this terminology. See information geometry.
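As a hedged illustration of what "Fisher information as geometry" can mean, the sketch below treats the Bernoulli family as a one-dimensional manifold whose metric is the Fisher information, and compares a numerically integrated path length with the known closed form for the Fisher-Rao distance in this family. The function names are hypothetical, and this is a toy calculation rather than a general-purpose tool.

```python
import math

def fisher_information_bernoulli(p):
    """Fisher information of Bernoulli(p) in the p parameterisation,
    computed directly as the expected squared score."""
    score_if_one = 1.0 / p             # d/dp log p
    score_if_zero = -1.0 / (1.0 - p)   # d/dp log (1 - p)
    return p * score_if_one ** 2 + (1.0 - p) * score_if_zero ** 2

def fisher_rao_distance_numeric(p, q, n_steps=10_000):
    """Length of the path from p to q under the Fisher metric,
    using a midpoint rule for the integral of sqrt(I(t)) dt."""
    lo, hi = min(p, q), max(p, q)
    h = (hi - lo) / n_steps
    return sum(
        math.sqrt(fisher_information_bernoulli(lo + (i + 0.5) * h)) * h
        for i in range(n_steps)
    )

def fisher_rao_distance_closed_form(p, q):
    """Closed form for the Bernoulli family: 2 |arcsin sqrt(q) - arcsin sqrt(p)|."""
    return 2.0 * abs(math.asin(math.sqrt(q)) - math.asin(math.sqrt(p)))

d_numeric = fisher_rao_distance_numeric(0.2, 0.7)
d_exact = fisher_rao_distance_closed_form(0.2, 0.7)
```

The two distances should agree to within the quadrature error. Note that the distance between 0.01 and 0.02 is larger than the distance between 0.50 and 0.51, which matches the intuition that nearly deterministic coins are easier to tell apart.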

Hamiltonian Monte Carlo

Hamiltonian Monte Carlo can also be discussed in this setting; see Betancourt et al. (2017) and Girolami and Calderhead (2011). I will not.

Langevin Monte Carlo

Girolami and Calderhead (2011) discuss Langevin Monte Carlo in this context; see also Xifara et al. (2014).
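For orientation, here is a minimal sketch of the ordinary, flat-space Metropolis-adjusted Langevin algorithm (MALA) in one dimension. This is not the Riemannian method of the papers above: as I understand it, those replace the constant step size with a position-dependent metric and add correction terms, none of which is attempted here. The function name `mala` and its arguments are hypothetical.

```python
import math
import random

def mala(log_density, grad_log_density, x0, step_size, n_samples, seed=0):
    """Flat-space MALA for a one-dimensional target density."""
    rng = random.Random(seed)

    def proposal_mean(x):
        # Langevin drift: half a squared step along the gradient.
        return x + 0.5 * step_size ** 2 * grad_log_density(x)

    def log_q(x_to, x_from):
        # Log proposal density, up to a constant that cancels in the ratio.
        return -((x_to - proposal_mean(x_from)) ** 2) / (2.0 * step_size ** 2)

    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = proposal_mean(x) + step_size * rng.gauss(0.0, 1.0)
        log_alpha = (
            log_density(proposal) - log_density(x)
            + log_q(x, proposal) - log_q(proposal, x)
        )
        # Metropolis-Hastings accept/reject step.
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = proposal
        samples.append(x)
    return samples

# Toy target: a standard normal, with the log density known up to a constant.
samples = mala(lambda x: -0.5 * x * x, lambda x: -x,
               x0=0.0, step_size=1.0, n_samples=20_000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

For this toy target the sample mean and variance should land near 0 and 1. A manifold version would, roughly, let the proposal scale vary with position according to a metric such as the Fisher information.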

Natural gradient

See natural gradients.
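As a reminder of the basic idea, the natural gradient premultiplies the ordinary gradient by the inverse Fisher information. Below is a toy sketch, under the assumption of a Bernoulli likelihood parameterised by its mean p, comparing a plain gradient step with a natural gradient step on the average log-likelihood. All function names are hypothetical.

```python
def log_lik_grad(p, x_bar):
    """Gradient in p of the average Bernoulli log-likelihood,
    where x_bar is the observed fraction of ones."""
    return x_bar / p - (1.0 - x_bar) / (1.0 - p)

def fisher_information(p):
    """Fisher information of Bernoulli(p) in the p parameterisation."""
    return 1.0 / (p * (1.0 - p))

def plain_gradient_step(p, x_bar, step_size):
    return p + step_size * log_lik_grad(p, x_bar)

def natural_gradient_step(p, x_bar, step_size):
    # Rescale the gradient by the inverse Fisher information.
    return p + step_size * log_lik_grad(p, x_bar) / fisher_information(p)

p_plain = plain_gradient_step(0.3, 0.8, step_size=1.0)
p_natural = natural_gradient_step(0.3, 0.8, step_size=1.0)
```

In this particular model the rescaled gradient simplifies to `x_bar - p`, so a unit natural gradient step lands on the maximum likelihood estimate `x_bar`, whereas the plain step with the same step size overshoots out of the interval (0, 1). That tidy behaviour is special to this example and should not be read as a general guarantee.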

Homogeneous probability

Albert Tarantola’s framing, from his possibly forthcoming manuscript. How does it relate to information geometry? I don’t know yet; I haven’t had time to read it. It is also not a common phrasing, which is a danger sign.

References


Absil, P.-A., R. Mahony, and R. Sepulchre. 2008. Optimization Algorithms on Matrix Manifolds. Princeton, N.J.; Woodstock: Princeton University Press.
Amari, Shun-ichi. 1998. “Natural Gradient Works Efficiently in Learning.” Neural Computation 10 (2): 251–76. https://doi.org/10.1162/089976698300017746.
Amari, Shun-ichi. 1987. “Differential Geometrical Theory of Statistics.” In Differential Geometry in Statistical Inference, 19–94.
———. 2001. “Information Geometry on Hierarchy of Probability Distributions.” IEEE Transactions on Information Theory 47: 1701–11. https://doi.org/10.1109/18.930911.
Aswani, Anil, Peter Bickel, and Claire Tomlin. 2011. “Regression on Manifolds: Estimation of the Exterior Derivative.” The Annals of Statistics 39 (1): 48–81. https://doi.org/10.1214/10-AOS823.
Barndorff-Nielsen, O. E. 1987. “Differential and Integral Geometry in Statistical Inference.” In Differential Geometry in Statistical Inference. Sn Aarhus.
Betancourt, Michael, Simon Byrne, Sam Livingstone, and Mark Girolami. 2017. “The Geometric Foundations of Hamiltonian Monte Carlo.” Bernoulli 23 (November): 2257–98. https://doi.org/10.3150/16-BEJ810.
Borovitskiy, Viacheslav, Alexander Terenin, Peter Mostowsky, and Marc Peter Deisenroth. 2020. “Matern Gaussian Processes on Riemannian Manifolds.” June 17, 2020. http://arxiv.org/abs/2006.10160.
Boumal, Nicolas. 2013. “On Intrinsic Cramér-Rao Bounds for Riemannian Submanifolds and Quotient Manifolds.” IEEE Transactions on Signal Processing 61 (7): 1809–21. https://doi.org/10.1109/TSP.2013.2242068.
Boumal, Nicolas, Bamdev Mishra, P.-A. Absil, and Rodolphe Sepulchre. 2014. “Manopt, a Matlab Toolbox for Optimization on Manifolds.” Journal of Machine Learning Research 15: 1455–59. http://jmlr.org/papers/v15/boumal14a.html.
Boumal, Nicolas, Amit Singer, P.-A. Absil, and Vincent D. Blondel. 2014. “Cramér-Rao Bounds for Synchronization of Rotations.” Information and Inference 3 (1): 1–39. https://doi.org/10.1093/imaiai/iat006.
Carlsson, Gunnar, Tigran Ishkhanov, Vin de Silva, and Afra Zomorodian. 2008. “On the Local Behavior of Spaces of Natural Images.” International Journal of Computer Vision 76 (1): 1–12. https://doi.org/10.1007/s11263-007-0056-x.
Chen, Minhua, J. Silva, J. Paisley, Chunping Wang, D. Dunson, and L. Carin. 2010. “Compressive Sensing on Manifolds Using a Nonparametric Mixture of Factor Analyzers: Algorithm and Performance Bounds.” IEEE Transactions on Signal Processing 58 (12): 6140–55. https://doi.org/10.1109/TSP.2010.2070796.
Fernández-Martínez, J. L., Z. Fernández-Muñiz, J. L. G. Pallero, and L. M. Pedruelo-González. 2013. “From Bayes to Tarantola: New Insights to Understand Uncertainty in Inverse Problems.” Journal of Applied Geophysics 98 (November): 62–72. https://doi.org/10.1016/j.jappgeo.2013.07.005.
Ge, Rong, and Tengyu Ma. 2017. “On the Optimization Landscape of Tensor Decompositions.” In Advances In Neural Information Processing Systems. http://arxiv.org/abs/1706.05598.
Girolami, Mark, and Ben Calderhead. 2011. “Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (2): 123–214. https://doi.org/10.1111/j.1467-9868.2010.00765.x.
Hosseini, Reshad, and Suvrit Sra. 2015. “Manifold Optimization for Gaussian Mixture Models.” 2015. http://arxiv.org/abs/1506.07677.
Lauritzen, S. L. 1987. “Statistical Manifolds.” In Differential Geometry in Statistical Inference, 164. JSTOR.
Manton, Jonathan H. 2013. “A Primer on Stochastic Differential Geometry for Signal Processing.” IEEE Journal of Selected Topics in Signal Processing 7 (4): 681–99. https://doi.org/10.1109/JSTSP.2013.2264798.
Miolane, Nina, Johan Mathe, Claire Donnat, Mikael Jorda, and Xavier Pennec. 2018. “Geomstats: A Python Package for Riemannian Geometry in Machine Learning.” May 21, 2018. http://arxiv.org/abs/1805.08308.
Mosegaard, Klaus, and Albert Tarantola. 1995. “Monte Carlo Sampling of Solutions to Inverse Problems.” Journal of Geophysical Research 100 (B7): 12431. http://www.gfy.ku.dk/ klaus/ip/MT-1995.pdf.
Mukherjee, Sayan, Qiang Wu, and Ding-Xuan Zhou. 2010. “Learning Gradients on Manifolds.” Bernoulli 16 (1): 181–207. https://doi.org/10.3150/09-BEJ206.
Peters, Jan. 2010. “Policy Gradient Methods.” Scholarpedia 5 (11): 3698. https://doi.org/10.4249/scholarpedia.3698.
Seshadhri, C., Aneesh Sharma, Andrew Stolman, and Ashish Goel. 2020. “The Impossibility of Low-Rank Representations for Triangle-Rich Complex Networks.” Proceedings of the National Academy of Sciences 117 (11): 5631–37. https://doi.org/10.1073/pnas.1911030117.
Steinke, Florian, and Matthias Hein. 2009. “Non-Parametric Regression Between Manifolds.” In Advances in Neural Information Processing Systems 21, 1561–68. Curran Associates, Inc. http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2008_0692.pdf.
Townsend, James, Niklas Koep, and Sebastian Weichwald. 2016. “Pymanopt: A Python Toolbox for Optimization on Manifolds Using Automatic Differentiation.” Journal of Machine Learning Research 17 (137): 1–5. http://jmlr.org/papers/v17/16-177.html.
Transtrum, Mark K, Benjamin B Machta, and James P Sethna. 2011. “The Geometry of Nonlinear Least Squares with Applications to Sloppy Models and Optimization.” Physical Review E 83 (3): 036701. https://doi.org/10.1103/PhysRevE.83.036701.
Wang, Yu Guang, and Xiaosheng Zhuang. 2016. “Tight Framelets and Fast Framelet Transforms on Manifolds.” August 13, 2016. http://arxiv.org/abs/1608.04026.
Xifara, T., C. Sherlock, S. Livingstone, S. Byrne, and M. Girolami. 2014. “Langevin Diffusions and the Metropolis-Adjusted Langevin Algorithm.” Statistics & Probability Letters 91 (August): 14–19. https://doi.org/10.1016/j.spl.2014.04.002.