Optimal transport inference

> I feel the earth mover under my feet,
> I feel the ψ tumbling down,
> I feel my heart start to trembling,
> Whenever you’re around my empirical density in minimal transport cost

Doing inference where the discrepancy between probability distributions is an optimal-transport metric, such as a Wasserstein distance. Usually computationally intractable, but desirable when we can get it, since transport metrics respect the geometry of the sample space.
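To fix intuitions, one case where the transport metric *is* tractable: on the real line, the optimal coupling between two equal-size empirical measures is monotone, so the 1-Wasserstein distance reduces to pairing off order statistics. A minimal numpy sketch (the function name is mine, not from any cited paper):

```python
import numpy as np

def wasserstein1_1d(x, y):
    """W1 between two equal-size empirical measures on the line.

    In 1-D the optimal transport plan is monotone: sort both samples
    and pair them off, so W1 is the mean absolute difference of the
    order statistics.
    """
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    assert x.shape == y.shape, "equal-size samples assumed for simplicity"
    return np.abs(x - y).mean()

# Two point clouds differing by a unit shift: W1 is exactly the shift.
print(wasserstein1_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))  # → 1.0
```

Beyond one dimension this shortcut disappears, which is why so much of the bibliography below is about approximation schemes.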

Wasserstein GANs (Arjovsky, Chintala, and Bottou 2017) are argued to approximately minimise such a distance between the data and model distributions.
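The workhorse for making discrete transport problems cheap is entropic regularization via Sinkhorn iteration (Cuturi 2013). A hedged sketch in plain numpy, not production code — small problems only, and the unstabilized scaling below underflows for tiny `eps`:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.05, n_iter=500):
    """Entropy-regularized OT plan between histograms a, b with cost C.

    Alternating scaling updates on the Gibbs kernel K = exp(-C/eps);
    returns the transport plan P = diag(u) K diag(v).
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Uniform histograms on three points of the line; cost = squared distance.
x = np.array([0.0, 1.0, 2.0])
C = (x[:, None] - x[None, :]) ** 2
a = b = np.ones(3) / 3
P = sinkhorn(a, b, C)
print(P.sum())        # ≈ 1.0: P is a coupling of a and b
print((P * C).sum())  # entropic OT cost; ≈ 0 here since the measures coincide
```

For stabilized log-domain variants and screening tricks see Schmitzer (2019) and Alaya et al. (2019) below.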

See e.g. (J. H. Huggins et al. 2018a, 2018b) for a particular Bayes posterior approximation with guarantees in this metric.
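The kind of quantity such posterior-approximation bounds control is the transport distance between the true and approximate posterior. For Gaussians this is available in closed form, which makes a handy sanity check; a one-dimensional sketch (an illustration of the metric, not the cited papers' method):

```python
import numpy as np

def w2_gaussian_1d(mu1, s1, mu2, s2):
    """Closed-form 2-Wasserstein distance between N(mu1, s1^2) and N(mu2, s2^2).

    In one dimension, W2^2 = (mu1 - mu2)^2 + (s1 - s2)^2.
    """
    return np.sqrt((mu1 - mu2) ** 2 + (s1 - s2) ** 2)

# A variational approximation with the right mean but half the posterior scale:
print(w2_gaussian_1d(0.0, 1.0, 0.0, 0.5))  # → 0.5
```

The multivariate analogue replaces the scale term with a trace of matrix square roots; either way, W2 error translates directly into error in posterior means and variances, which is the selling point of these bounds.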



Agueh, Martial, and Guillaume Carlier. 2011. “Barycenters in the Wasserstein Space.” SIAM Journal on Mathematical Analysis 43 (2): 904–24. https://doi.org/10.1137/100805741.
Alaya, Mokhtar Z., Maxime Berar, Gilles Gasso, and Alain Rakotomamonjy. 2019. “Screening Sinkhorn Algorithm for Regularized Optimal Transport.” Advances in Neural Information Processing Systems 32. https://papers.nips.cc/paper/2019/hash/95688ba636a4720a85b3634acfec8cdd-Abstract.html.
Altschuler, Jason, Jonathan Niles-Weed, and Philippe Rigollet. 2017. “Near-Linear Time Approximation Algorithms for Optimal Transport via Sinkhorn Iteration.” In Advances in Neural Information Processing Systems 30.
Ambrogioni, Luca, Umut Guclu, and Marcel van Gerven. 2018. “Wasserstein Variational Gradient Descent: From Semi-Discrete Optimal Transport to Ensemble Variational Inference.” arXiv:1811.02827 [cs, Stat], November. http://arxiv.org/abs/1811.02827.
Ambrogioni, Luca, Umut Güçlü, Yagmur Güçlütürk, Max Hinne, Eric Maris, and Marcel A. J. van Gerven. 2018. “Wasserstein Variational Inference.” In Proceedings of the 32Nd International Conference on Neural Information Processing Systems, 2478–87. NIPS’18. USA: Curran Associates Inc. http://arxiv.org/abs/1805.11284.
Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savare. 2008. Gradient Flows: In Metric Spaces and in the Space of Probability Measures. 2nd ed. Lectures in Mathematics. ETH Zürich. Birkhäuser Basel. https://www.springer.com/gp/book/9783764387211.
Angenent, Sigurd, Steven Haker, and Allen Tannenbaum. 2003. “Minimizing Flows for the Monge–Kantorovich Problem.” SIAM Journal on Mathematical Analysis 35 (1): 61–97. https://doi.org/10.1137/S0036141002410927.
Arjovsky, Martin, Soumith Chintala, and Léon Bottou. 2017. “Wasserstein Generative Adversarial Networks.” In International Conference on Machine Learning, 214–23. http://proceedings.mlr.press/v70/arjovsky17a.html.
Arora, Sanjeev, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang. 2017. “Generalization and Equilibrium in Generative Adversarial Nets (GANs).” arXiv:1703.00573 [cs], March. http://arxiv.org/abs/1703.00573.
Bachoc, Francois, Alexandra Suvorikova, David Ginsbourger, Jean-Michel Loubes, and Vladimir Spokoiny. 2019. “Gaussian Processes with Multidimensional Distribution Inputs via Optimal Transport and Hilbertian Embedding.” arXiv:1805.00753 [stat], April. http://arxiv.org/abs/1805.00753.
Benamou, Jean-David, Guillaume Carlier, Marco Cuturi, Luca Nenna, and Gabriel Peyré. 2014. “Iterative Bregman Projections for Regularized Transportation Problems.” arXiv:1412.5154 [math], December. http://arxiv.org/abs/1412.5154.
Berg, Rianne van den, Leonard Hasenclever, Jakub M. Tomczak, and Max Welling. 2018. “Sylvester Normalizing Flows for Variational Inference.” In UAI 2018. http://arxiv.org/abs/1803.05649.
Blanchet, Jose, Lin Chen, and Xun Yu Zhou. 2018. “Distributionally Robust Mean-Variance Portfolio Selection with Wasserstein Distances.” arXiv:1802.04885 [stat], February. http://arxiv.org/abs/1802.04885.
Blanchet, Jose, Arun Jambulapati, Carson Kent, and Aaron Sidford. 2018. “Towards Optimal Running Times for Optimal Transport.” arXiv:1810.07717 [cs], October. http://arxiv.org/abs/1810.07717.
Blanchet, Jose, Yang Kang, and Karthyek Murthy. 2016. “Robust Wasserstein Profile Inference and Applications to Machine Learning.” arXiv:1610.05627 [math, Stat], October. http://arxiv.org/abs/1610.05627.
Blanchet, Jose, Karthyek Murthy, and Nian Si. 2019. “Confidence Regions in Wasserstein Distributionally Robust Estimation.” arXiv:1906.01614 [math, Stat], June. http://arxiv.org/abs/1906.01614.
Blanchet, Jose, Karthyek Murthy, and Fan Zhang. 2018. “Optimal Transport Based Distributionally Robust Optimization: Structural Properties and Iterative Schemes.” arXiv:1810.02403 [math], October. http://arxiv.org/abs/1810.02403.
Blondel, Mathieu, Vivien Seguy, and Antoine Rolet. 2018. “Smooth and Sparse Optimal Transport.” In AISTATS 2018. http://arxiv.org/abs/1710.06276.
Boissard, Emmanuel. 2011. “Simple Bounds for the Convergence of Empirical and Occupation Measures in 1-Wasserstein Distance.” Electronic Journal of Probability 16 (none). https://doi.org/10.1214/EJP.v16-958.
Bonneel, Nicolas, Michiel van de Panne, Sylvain Paris, and Wolfgang Heidrich. 2011. “Displacement Interpolation Using Lagrangian Mass Transport.” ACM Transactions on Graphics 30 (6).
Canas, Guillermo D., and Lorenzo Rosasco. 2012. “Learning Probability Measures with Respect to Optimal Transport Metrics.” arXiv:1209.1077 [cs, Stat], September. http://arxiv.org/abs/1209.1077.
Carlier, Guillaume, Marco Cuturi, Brendan Pass, and Carola Schoenlieb. 2017. “Optimal Transport Meets Probability, Statistics and Machine Learning.”
Chizat, Lenaic, Gabriel Peyré, Bernhard Schmitzer, and François-Xavier Vialard. 2017. “Scaling Algorithms for Unbalanced Transport Problems.” arXiv:1607.05816 [math], May. http://arxiv.org/abs/1607.05816.
Chu, Casey, Jose Blanchet, and Peter Glynn. 2019. “Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning.” In ICML. http://arxiv.org/abs/1901.10691.
Corenflos, Adrien, James Thornton, George Deligiannidis, and Arnaud Doucet. 2021. “Differentiable Particle Filtering via Entropy-Regularized Optimal Transport.” arXiv:2102.07850 [cs, Stat], June. http://arxiv.org/abs/2102.07850.
Coscia, Michele. 2020. “Generalized Euclidean Measure to Estimate Network Distances.”
Courty, Nicolas, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. 2016. “Optimal Transport for Domain Adaptation.” arXiv:1507.00504 [cs], June. http://arxiv.org/abs/1507.00504.
Cuturi, Marco. 2013. “Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances.” In Advances in Neural Information Processing Systems 26. https://arxiv.org/abs/1306.0895v1.
Cuturi, Marco, and Arnaud Doucet. 2014. “Fast Computation of Wasserstein Barycenters.” In International Conference on Machine Learning, 685–93. PMLR. http://proceedings.mlr.press/v32/cuturi14.html.
Fernholz, Luisa Turrin. 1983. Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics 19. New York: Springer.
Flamary, Remi, Alain Rakotomamonjy, Nicolas Courty, and Devis Tuia. n.d. “Optimal Transport with Laplacian Regularization.”
Flamary, Rémi, Marco Cuturi, Nicolas Courty, and Alain Rakotomamonjy. 2018. “Wasserstein Discriminant Analysis.” Machine Learning 107 (12): 1923–45. https://doi.org/10.1007/s10994-018-5717-1.
Frogner, Charlie, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya, and Tomaso A Poggio. 2015. “Learning with a Wasserstein Loss.” In Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2053–61. Curran Associates, Inc. http://papers.nips.cc/paper/5679-learning-with-a-wasserstein-loss.pdf.
Gao, Rui, and Anton J. Kleywegt. 2016. “Distributionally Robust Stochastic Optimization with Wasserstein Distance.” arXiv:1604.02199 [math], April. http://arxiv.org/abs/1604.02199.
Garbuno-Inigo, Alfredo, Franca Hoffmann, Wuchen Li, and Andrew M. Stuart. 2020. “Interacting Langevin Diffusions: Gradient Structure and Ensemble Kalman Sampler.” SIAM Journal on Applied Dynamical Systems 19 (1): 412–41. https://doi.org/10.1137/19M1251655.
Genevay, Aude, Marco Cuturi, Gabriel Peyré, and Francis Bach. 2016. “Stochastic Optimization for Large-Scale Optimal Transport.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3432–40. Curran Associates, Inc. http://papers.nips.cc/paper/6566-stochastic-optimization-for-large-scale-optimal-transport.pdf.
Genevay, Aude, Gabriel Peyré, and Marco Cuturi. 2017. “Learning Generative Models with Sinkhorn Divergences.” arXiv:1706.00292 [stat], October. http://arxiv.org/abs/1706.00292.
Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. NIPS’14. Cambridge, MA, USA: Curran Associates, Inc. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
Gozlan, Nathael, and Christian Léonard. 2010. “Transport Inequalities. A Survey.” arXiv:1003.3852 [math], March. http://arxiv.org/abs/1003.3852.
Gulrajani, Ishaan, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. 2017. “Improved Training of Wasserstein GANs.” arXiv:1704.00028 [cs, Stat], March. http://arxiv.org/abs/1704.00028.
Guo, Xin, Johnny Hong, Tianyi Lin, and Nan Yang. 2017. “Relaxed Wasserstein with Applications to GANs.” arXiv:1705.07164 [cs, Stat], May. http://arxiv.org/abs/1705.07164.
Huggins, Jonathan H., Trevor Campbell, Mikołaj Kasprzak, and Tamara Broderick. 2018a. “Scalable Gaussian Process Inference with Finite-Data Mean and Variance Guarantees.” arXiv:1806.10234 [cs, Stat], June. http://arxiv.org/abs/1806.10234.
———. 2018b. “Practical Bounds on the Error of Bayesian Posterior Approximations: A Nonasymptotic Approach.” arXiv:1809.09505 [cs, Math, Stat], September. http://arxiv.org/abs/1809.09505.
Huggins, Jonathan H., Mikołaj Kasprzak, Trevor Campbell, and Tamara Broderick. 2019. “Practical Posterior Error Bounds from Variational Objectives.” arXiv:1910.04102 [cs, Math, Stat], October. http://arxiv.org/abs/1910.04102.
Huggins, Jonathan, Ryan P Adams, and Tamara Broderick. 2017. “PASS-GLM: Polynomial Approximate Sufficient Statistics for Scalable Bayesian GLM Inference.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 3611–21. Curran Associates, Inc. http://papers.nips.cc/paper/6952-pass-glm-polynomial-approximate-sufficient-statistics-for-scalable-bayesian-glm-inference.pdf.
Kim, Jin W., and Prashant G. Mehta. 2019. “An Optimal Control Derivation of Nonlinear Smoothing Equations,” April. https://arxiv.org/abs/1904.01710v1.
Léonard, Christian. 2014. “A Survey of the Schrödinger Problem and Some of Its Connections with Optimal Transport.” Discrete & Continuous Dynamical Systems - A 34 (4): 1533. https://doi.org/10.3934/dcds.2014.34.1533.
Liu, Huidong, Xianfeng Gu, and Dimitris Samaras. 2018. “A Two-Step Computation of the Exact GAN Wasserstein Distance.” In International Conference on Machine Learning, 3159–68. http://proceedings.mlr.press/v80/liu18d.html.
Liu, Qiang, and Dilin Wang. 2019. “Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm.” In Advances In Neural Information Processing Systems. http://arxiv.org/abs/1608.04471.
Louizos, Christos, and Max Welling. 2017. “Multiplicative Normalizing Flows for Variational Bayesian Neural Networks.” In International Conference on Machine Learning, 2218–27. PMLR. http://proceedings.mlr.press/v70/louizos17a.html.
Mahdian, Saied, Jose Blanchet, and Peter Glynn. 2019. “Optimal Transport Relaxations with Application to Wasserstein GANs.” arXiv:1906.03317 [cs, Math, Stat], June. https://arxiv.org/abs/1906.03317v1.
Marzouk, Youssef, Tarek Moselhy, Matthew Parno, and Alessio Spantini. 2016. “Sampling via Measure Transport: An Introduction.” In Handbook of Uncertainty Quantification, edited by Roger Ghanem, David Higdon, and Houman Owhadi, 1–41. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-11259-6_23-1.
Maurya, Abhinav. 2018. “Optimal Transport in Statistical Machine Learning: Selected Review and Some Open Questions.”
Mohajerin Esfahani, Peyman, and Daniel Kuhn. 2018. “Data-Driven Distributionally Robust Optimization Using the Wasserstein Metric: Performance Guarantees and Tractable Reformulations.” Mathematical Programming 171 (1): 115–66. https://doi.org/10.1007/s10107-017-1172-1.
Montavon, Grégoire, Klaus-Robert Müller, and Marco Cuturi. 2016. “Wasserstein Training of Restricted Boltzmann Machines.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3711–19. Curran Associates, Inc. http://papers.nips.cc/paper/6248-wasserstein-training-of-restricted-boltzmann-machines.pdf.
Ostrovski, Georg, Will Dabney, and Remi Munos. 2018. “Autoregressive Quantile Networks for Generative Modeling.” In International Conference on Machine Learning. PMLR.
Panaretos, Victor M., and Yoav Zemel. 2019. “Statistical Aspects of Wasserstein Distances.” Annual Review of Statistics and Its Application 6 (1): 405–31. https://doi.org/10.1146/annurev-statistics-030718-104938.
Perrot, Michaël, Nicolas Courty, Rémi Flamary, and Amaury Habrard. 2016. “Mapping Estimation for Discrete Optimal Transport.” In Advances in Neural Information Processing Systems 29.
Peyré, Gabriel, and Marco Cuturi. 2019. Computational Optimal Transport. Foundations and Trends in Machine Learning 11 (5-6). https://doi.org/10.1561/2200000073.
Peyré, Gabriel, Marco Cuturi, and Justin Solomon. 2016. “Gromov-Wasserstein Averaging of Kernel and Distance Matrices.” In International Conference on Machine Learning, 2664–72. PMLR. http://proceedings.mlr.press/v48/peyre16.html.
Redko, Ievgen, Nicolas Courty, Rémi Flamary, and Devis Tuia. 2019. “Optimal Transport for Multi-Source Domain Adaptation Under Target Shift.” In The 22nd International Conference on Artificial Intelligence and Statistics, 849–58. PMLR. http://proceedings.mlr.press/v89/redko19a.html.
Rezende, Danilo Jimenez, and Shakir Mohamed. 2015. “Variational Inference with Normalizing Flows.” In International Conference on Machine Learning, 1530–38. ICML’15. Lille, France: JMLR.org. http://arxiv.org/abs/1505.05770.
Rustamov, Raif M. 2019. “Closed-Form Expressions for Maximum Mean Discrepancy with Applications to Wasserstein Auto-Encoders.” arXiv:1901.03227 [cs, Stat], January. http://arxiv.org/abs/1901.03227.
Santambrogio, Filippo. 2015. Optimal Transport for Applied Mathematicians. Progress in Nonlinear Differential Equations and Their Applications. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-20828-2_1.
Schmitzer, Bernhard. 2019. “Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems.” arXiv:1610.06519 [cs, Math], February. http://arxiv.org/abs/1610.06519.
Solomon, Justin, Fernando de Goes, Gabriel Peyré, Marco Cuturi, Adrian Butscher, Andy Nguyen, Tao Du, and Leonidas Guibas. 2015. “Convolutional Wasserstein Distances: Efficient Optimal Transportation on Geometric Domains.” ACM Transactions on Graphics 34 (4): 66:1–11. https://doi.org/10.1145/2766963.
Spantini, Alessio, Daniele Bigoni, and Youssef Marzouk. 2017. “Inference via Low-Dimensional Couplings.” Journal of Machine Learning Research 19 (66): 2639–709. http://arxiv.org/abs/1703.06131.
Taghvaei, Amirhossein, and Prashant G. Mehta. 2019. “An Optimal Transport Formulation of the Ensemble Kalman Filter,” October. https://arxiv.org/abs/1910.02338v1.
Verdinelli, Isabella, and Larry Wasserman. 2019. “Hybrid Wasserstein Distance and Fast Distribution Clustering.” Electronic Journal of Statistics 13 (2): 5088–5119. https://doi.org/10.1214/19-EJS1639.
Wang, Prince Zizhuang, and William Yang Wang. 2019. “Riemannian Normalizing Flow on Variational Wasserstein Autoencoder for Text Modeling.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 284–94. Minneapolis, Minnesota: Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1025.
Zhang, Rui, Christian Walder, Edwin V. Bonilla, Marian-Andrei Rizoiu, and Lexing Xie. 2020. “Quantile Propagation for Wasserstein-Approximate Gaussian Processes.” In Proceedings of NeurIPS 2020. http://arxiv.org/abs/1912.10200.
Zhu, B., J. Jiao, and D. Tse. 2020. “Deconstructing Generative Adversarial Networks.” IEEE Transactions on Information Theory 66 (11): 7155–79. https://doi.org/10.1109/TIT.2020.2983698.
