A placeholder for learning on curved spaces. Not discussed: learning OF the curvature of spaces.

AFAICT this usually boils down to defining an appropriate stochastic process on a manifold.

## Learning on a given manifold

Learning on a manifold that is given *a priori* also seems to fall under this heading. For example, the manifold of positive definite matrices is treated in depth in Chikuse and 筑瀬 (2003).

See the work of, e.g.

Manifold optimisation implementations:

- PyTorch: Lezcano/geotorch, a constrained optimization toolkit for PyTorch (Lezcano Casado 2019)
- MATLAB: manopt
- Python: pymanopt
- Julia: Manopt.jl
- Python: Nina Miolane et al.’s Geomstats project
- C++: ROPTLIB (Huang et al. 2018)
- R: ManifoldOptim wraps ROPTLIB (Martin et al. 2016)
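To give a flavour of what these toolkits automate, here is a minimal sketch of Riemannian gradient ascent on the unit sphere, written against the Python standard library only so that it does not depend on (or misrepresent) any of the APIs above. The function name `rayleigh_ascent`, the step size and the iteration count are illustrative assumptions, not anything taken from those libraries. It maximises the Rayleigh quotient $x^\top A x$ over unit vectors, whose maximiser is a leading eigenvector of a symmetric $A$.

```python
import math


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def normalize(x):
    n = math.sqrt(dot(x, x))
    return [xi / n for xi in x]


def rayleigh_ascent(A, x0, step=0.1, iters=500):
    """Maximise f(x) = x^T A x over the unit sphere (A symmetric).

    Hypothetical helper for illustration: plain Riemannian gradient
    ascent with a projection retraction and a fixed step size.
    """
    x = normalize(x0)
    for _ in range(iters):
        # Euclidean gradient of x^T A x for symmetric A.
        egrad = [2.0 * v for v in matvec(A, x)]
        # Riemannian gradient: project onto the tangent space at x,
        # i.e. remove the component along x.
        c = dot(egrad, x)
        rgrad = [g - c * xi for g, xi in zip(egrad, x)]
        # Retraction: step in the ambient space, then renormalise
        # to land back on the sphere.
        x = normalize([xi + step * g for xi, g in zip(x, rgrad)])
    return x, dot(x, matvec(A, x))


A = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues are 1 and 3
x, value = rayleigh_ascent(A, [1.0, 0.0])
print(x, value)  # value should approach the largest eigenvalue, 3
```

The two manifold-specific ingredients are the tangent-space projection and the retraction. The libraries listed above supply such ingredients for many manifolds, together with more careful step-size rules than the fixed step assumed here.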

There are at least two textbooks online:

## Information Geometry

The unholy offspring of Fisher information and differential geometry, about which I know little except that it sounds like it should be intuitive. It is probably synonymous with some of the other items on this page if I could sort out all this terminology. See information geometry.
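For orientation, the object that the field is usually built around is the Fisher information metric. For a parametric family of densities $p(x;\theta)$ (the notation is my assumption, not taken from any particular source on this page), under the usual regularity conditions it is

$$
g_{ij}(\theta)
= \mathbb{E}_{x \sim p(\cdot\,;\theta)}\!\left[
\frac{\partial \log p(x;\theta)}{\partial \theta_i}\,
\frac{\partial \log p(x;\theta)}{\partial \theta_j}
\right],
$$

which is positive semi-definite by construction and, where it is positive definite, can serve as a Riemannian metric on the parameter space. For example, for the Gaussian family $\mathcal{N}(\mu,\sigma^2)$ in coordinates $(\mu,\sigma)$ it evaluates to $\operatorname{diag}(1/\sigma^2,\ 2/\sigma^2)$.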

## Hamiltonian Monte Carlo

You can also discuss Hamiltonian Monte Carlo in this setting. I will not.

## Langevin Monte Carlo

Girolami et al. discuss Langevin Monte Carlo in this context.

## Natural gradient

See natural gradients.
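As a reminder of the basic idea, a natural gradient step preconditions the ordinary gradient by the inverse Fisher information, $\theta \leftarrow \theta + \eta\, F(\theta)^{-1} \nabla_\theta \ell(\theta)$. The following is a minimal sketch under stated assumptions: a univariate Gaussian in $(\mu,\sigma)$ coordinates, whose Fisher information is $\operatorname{diag}(1/\sigma^2, 2/\sigma^2)$, fitted by maximum likelihood. The function name `natural_gradient_fit` and its default learning rate are hypothetical choices for illustration.

```python
import math


def natural_gradient_fit(data, mu=0.0, sigma=1.0, lr=0.5, iters=100):
    """Fit N(mu, sigma^2) to data by natural gradient ascent on the
    average log-likelihood (illustrative sketch, fixed learning rate)."""
    n = len(data)
    mean = sum(data) / n
    for _ in range(iters):
        # Second moment of the data about the current mu.
        m2 = sum((x - mu) ** 2 for x in data) / n
        # Ordinary gradient of the average log-likelihood.
        g_mu = (mean - mu) / sigma**2
        g_sigma = -1.0 / sigma + m2 / sigma**3
        # Precondition by the inverse Fisher information,
        # F^{-1} = diag(sigma^2, sigma^2 / 2).
        nat_mu = sigma**2 * g_mu
        nat_sigma = 0.5 * sigma**2 * g_sigma
        mu += lr * nat_mu
        sigma += lr * nat_sigma
    return mu, sigma


mu_hat, sigma_hat = natural_gradient_fit([1.0, 2.0, 3.0, 4.0, 5.0])
print(mu_hat, sigma_hat)  # approaches the sample mean and the MLE std dev
```

In these coordinates the preconditioned update for $\mu$ no longer depends on $\sigma$, which is the sort of reparametrisation-robust behaviour natural gradients are meant to provide.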

## Homogeneous probability

Albert Tarantola’s framing, from his manuscript. How does it relate to information geometry? I don’t know yet. Haven’t had time to read. Also not a common phrasing, which is a danger sign.

## Incoming

- Agustinus Kristiadi, Fisher Information Matrix
- Agustinus Kristiadi, Hessian and Curvatures in Machine Learning: A Differential-Geometric View
- Agustinus Kristiadi, Notes on Riemannian Geometry
- Agustinus Kristiadi, Optimization and Gradient Descent on Riemannian Manifolds

## References

*Optimization Algorithms on Matrix Manifolds*. Princeton, N.J.; Woodstock: Princeton University Press.

*Neural Computation* 10 (2): 251–76.

*Differential Geometry in Statistical Inference*, 19–94.

*IEEE Transactions on Information Theory* 47: 1701–11.

*The Annals of Statistics* 39 (1): 48–81.

*Differential Geometry in Statistical Inference*. Sn Aarhus.

*Bernoulli* 23 (4A): 2257–98.

*arXiv:2006.10160 [Cs, Stat]*, June.

*IEEE Transactions on Signal Processing* 61 (7): 1809–21.

*Journal of Machine Learning Research* 15: 1455–59.

*Information and Inference* 3 (1): 1–39.

*International Journal of Computer Vision* 76 (1): 1–12.

*IEEE Transactions on Signal Processing* 58 (12): 6140–55.

*Statistics on Special Manifolds*. New York, NY: Springer New York.

*Journal of Applied Geophysics* 98 (November): 62–72.

*Advances in Neural Information Processing Systems*.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 73 (2): 123–214.

*arXiv:2104.05508 [Cs, Stat]*, April.

*arXiv Preprint arXiv:1506.07677*.

*ACM Transactions on Mathematical Software* 44 (4): 43:1–21.

*Differential Geometry in Statistical Inference*, 164. JSTOR.

*Annual Review of Statistics and Its Application* 8 (1): 369–91.

*Advances in Neural Information Processing Systems*. Vol. 32. Curran Associates, Inc.

*IEEE Journal of Selected Topics in Signal Processing* 7 (4): 681–99.

*Directional Statistics*. John Wiley & Sons.

*arXiv:1805.08308 [Cs, Stat]*, May.

*Journal of Geophysical Research* 100 (B7): 12431.

*Bernoulli* 16 (1): 181–207.

*Scholarpedia* 5 (11): 3698.

*Proceedings of the National Academy of Sciences* 117 (11): 5631–37.

*Journal of Applied Probability* 19 (1): 221–28.

*Advances in Neural Information Processing Systems 21*, 1561–68. Curran Associates, Inc.

*Journal of Machine Learning Research* 17 (137): 1–5.

*Physical Review E* 83 (3): 036701.

*arXiv:1608.04026 [Math]*, August.

*Statistics & Probability Letters* 91 (Supplement C): 14–19.
