Doing inference where the probability metric measuring the discrepancy between some target distribution and the implied inferential distribution is an optimal-transport (e.g. Wasserstein) one. Frequently intractable, but neat when we can get it.

Wasserstein GANs are argued to perform an approximate optimal-transport inference, indirectly. See e.g. (J. H. Huggins et al. 2018a, 2018b) for a particular Bayes posterior approximation using OT.
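In one dimension the discrepancy itself is cheap to compute exactly from samples, which makes the idea concrete before reaching for the heavier tooling below. A minimal sketch using SciPy's `wasserstein_distance` (SciPy is not otherwise used in these notes):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
# Samples from a "target" distribution and an "inferential" approximation
# that is shifted by 0.1.
target = rng.normal(loc=0.0, scale=1.0, size=5000)
approx = rng.normal(loc=0.1, scale=1.0, size=5000)

# 1-Wasserstein distance between the two empirical distributions;
# for these samples it should sit near the mean shift of 0.1.
w1 = wasserstein_distance(target, approx)
```

An OT-flavoured inference procedure would then adjust the approximating distribution's parameters to shrink this discrepancy.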

## Tools

### OTT

Optimal Transport Tools (OTT) is a toolbox for all things Wasserstein. From its documentation:

The goal of OTT is to provide sturdy, versatile and efficient optimal transport solvers, taking advantage of JAX features, such as JIT, auto-vectorization and implicit differentiation.

A typical OT problem has two ingredients: a pair of weight vectors `a` and `b` (one for each measure), with a ground cost matrix that is either directly given, or derived as the pairwise evaluation of a cost function on pairs of points taken from the two measures. The main design choice in OTT comes from encapsulating the cost in a `Geometry` object, and bundling it with a few useful operations (notably kernel applications). The most common geometry is that of two clouds of vectors compared with the squared Euclidean distance, as illustrated in the example below.

A self-contained example of this in action:

```
import jax
import jax.numpy as jnp
from ott.tools import transport

# Sample two point clouds and their weights.
rngs = jax.random.split(jax.random.PRNGKey(0), 4)
n, m, d = 12, 14, 2
x = jax.random.normal(rngs[0], (n, d)) + 1
y = jax.random.uniform(rngs[1], (m, d))
a = jax.random.uniform(rngs[2], (n,))
b = jax.random.uniform(rngs[3], (m,))
a, b = a / jnp.sum(a), b / jnp.sum(b)

# Compute the coupling via the Sinkhorn algorithm.
ot = transport.solve(x, y, a=a, b=b)
P = ot.matrix
```

The call to `transport.solve` above works out the optimal transport solution via the Sinkhorn algorithm and stores its output. The transport matrix can then be instantiated from those optimal solutions together with the `Geometry`. That transport matrix links each point from the first point cloud to one or more points from the second.

To be more precise, the Sinkhorn algorithm operates on the `Geometry`, taking into account weights `a` and `b`, to solve the OT problem, producing a named tuple that contains two optimal dual potentials `f` and `g` (vectors of the same size as `a` and `b`), the objective `reg_ot_cost`, a log of the `errors` of the algorithm as it converges, and a `converged` flag.
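The iteration underlying that output is simple enough to sketch in plain NumPy. The following is a minimal entropic-OT Sinkhorn loop, not OTT's implementation (OTT works in the log domain with more safeguards); it shows where the coupling and the dual potentials `f` and `g` come from:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.5, n_iter=1000):
    """Minimal Sinkhorn sketch for entropically regularized OT.

    Alternately rescales the Gibbs kernel K = exp(-C / eps) so that the
    coupling's marginals match the weight vectors a and b.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)  # match column marginals
        u = a / (K @ v)    # match row marginals
    P = u[:, None] * K * v[None, :]  # the coupling matrix
    # Dual potentials, analogous to the f and g in OTT's output.
    f, g = eps * np.log(u), eps * np.log(v)
    return P, f, g

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 2)) + 1
y = rng.uniform(size=(14, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
a = np.full(12, 1 / 12)
b = np.full(14, 1 / 14)
P, f, g = sinkhorn(a, b, C)
```

At convergence the row sums of `P` equal `a` and the column sums equal `b`, which is exactly the marginal constraint the dual potentials enforce.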

### POT

## Incoming

- Rigollet and Weed (2018): > We give a statistical interpretation of entropic optimal transport by showing that performing maximum-likelihood estimation for Gaussian deconvolution corresponds to calculating a projection with respect to the entropic optimal transport distance.

## References

*SIAM Journal on Mathematical Analysis* 43 (2): 904–24.

*Advances in Neural Information Processing Systems* 32.

*arXiv:1811.02827 [Cs, Stat]*, November.

*Proceedings of the 32nd International Conference on Neural Information Processing Systems*, 2478–87. NIPS’18. USA: Curran Associates Inc.

*Gradient Flows: In Metric Spaces and in the Space of Probability Measures*. 2nd ed. Lectures in Mathematics. ETH Zürich. Birkhäuser Basel.

*SIAM Journal on Mathematical Analysis* 35 (1): 61–97.

*International Conference on Machine Learning*, 214–23.

*arXiv:1703.00573 [Cs]*, March.

*arXiv:1805.00753 [Stat]*, April.

*Acta Numerica* 30 (May): 249–325.

*arXiv:1412.5154 [Math]*, December.

*UAI18*.

*IFAC Proceedings Volumes*, 19th IFAC World Congress, 47 (3): 8662–68.

*arXiv:1802.04885 [Stat]*, February.

*arXiv:1810.07717 [Cs]*, October.

*arXiv:1610.05627 [Math, Stat]*, October.

*arXiv:1906.01614 [Math, Stat]*, June.

*arXiv:1810.02403 [Math]*, October.

*AISTATS 2018*.

*Electronic Journal of Probability* 16.

*arXiv:1209.1077 [Cs, Stat]*, September.

*arXiv:1607.05816 [Math]*, May.

*ICML*.

*arXiv:2102.07850 [Cs, Stat]*, June.

*arXiv:1507.00504 [Cs]*, June.

*Advances in Neural Information Processing Systems 26*.

*International Conference on Machine Learning*, 685–93. PMLR.

*von Mises calculus for statistical functionals*. Lecture Notes in Statistics 19. New York: Springer.

*Machine Learning* 107 (12): 1923–45.

*Advances in Neural Information Processing Systems 28*, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2053–61. Curran Associates, Inc.

*arXiv:1604.02199 [Math]*, April.

*SIAM Journal on Applied Dynamical Systems* 19 (1): 412–41.

*Advances in Neural Information Processing Systems 29*, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3432–40. Curran Associates, Inc.

*arXiv:1706.00292 [Stat]*, October.

*Advances in Neural Information Processing Systems 27*, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. NIPS’14. Cambridge, MA, USA: Curran Associates, Inc.

*arXiv:1003.3852 [Math]*, March.

*arXiv:1704.00028 [Cs, Stat]*, March.

*arXiv:1705.07164 [Cs, Stat]*, May.

*arXiv:1806.10234 [Cs, Stat]*, June.

*arXiv:1809.09505 [Cs, Math, Stat]*, September.

*arXiv:1910.04102 [Cs, Math, Stat]*, October.

*Advances in Neural Information Processing Systems 30*, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 3611–21. Curran Associates, Inc.

*Information Geometry*, June.

*Discrete & Continuous Dynamical Systems - A* 34 (4): 1533.

*International Conference on Machine Learning*, 3159–68.

*Advances In Neural Information Processing Systems*.

*PMLR*, 2218–27.

*arXiv:1906.03317 [Cs, Math, Stat]*, June.

*Information Geometry*, August.

*Handbook of Uncertainty Quantification*, edited by Roger Ghanem, David Higdon, and Houman Owhadi, 1–41. Cham: Springer International Publishing.

*SIAM/ASA Journal on Uncertainty Quantification*, February, 96–124.

*Mathematical Programming* 171 (1): 115–66.

*Advances in Neural Information Processing Systems 29*, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3711–19. Curran Associates, Inc.

*Annual Review of Statistics and Its Application* 6 (1): 405–31.

*Computational Optimal Transport*. Vol. 11.

*International Conference on Machine Learning*, 2664–72. PMLR.

*The 22nd International Conference on Artificial Intelligence and Statistics*, 849–58. PMLR.

*International Conference on Machine Learning*, 1530–38. ICML’15. Lille, France: JMLR.org.

*arXiv:1901.03227 [Cs, Stat]*, January.

*Optimal Transport for Applied Mathematicians*. Edited by Filippo Santambrogio. Progress in Nonlinear Differential Equations and Their Applications. Cham: Springer International Publishing.

*arXiv:1610.06519 [Cs, Math]*, February.

*ACM Transactions on Graphics* 34 (4): 66:1–11.

*Journal of Machine Learning Research* 19 (66): 2639–709.

*Electronic Journal of Statistics* 13 (2): 5088–5119.

*Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)*, 284–94. Minneapolis, Minnesota: Association for Computational Linguistics.

*Proceedings of NeurIPS 2020*.

*IEEE Transactions on Information Theory* 66 (11): 7155–79.
