Matrix measure concentration inequalities and bounds

Concentration inequalities for matrix-valued random variables are, loosely speaking, promises that some random matrix is close to some known value, measured in some metric, with some probability.

Recommended overviews are J. A. Tropp (2015); van Handel (2017); Vershynin (2018).

Matrix Chernoff

J. A. Tropp (2015) summarises:

In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic.

Are these related?

Nikhil Srivastava’s Discrepancy, Graphs, and the Kadison-Singer Problem has an interesting example of bounds via discrepancy theory (and only indirectly probability). D. Gross (2011) is also readable, and gives results for matrices over the complex field.
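For reference, a standard form of the matrix Chernoff tail bound (Tropp 2015, Theorem 5.1.1): let \(X_1,\dots,X_n\) be independent random positive-semidefinite matrices in \(\mathbb{H}^{d}\) with \(\lambda_{\max}(X_k) \leq L\) almost surely, and write \(\mu_{\max} := \lambda_{\max}\left(\sum_k \mathbb{E} X_k\right)\). Then for \(\delta \geq 0\), \[ \mathbb{P}\left\{\lambda_{\max}\left(\sum_k X_k\right) \geq (1+\delta)\mu_{\max}\right\} \leq d\left[\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right]^{\mu_{\max}/L}. \]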

Matrix Chebyshev

As discussed in, e.g. Paulin, Mackey, and Tropp (2016).

Let \(\mathbf{X} \in \mathbb{H}^{d}\) be a random matrix. For all \(t>0\) \[ \mathbb{P}\{\|\mathbf{X}\| \geq t\} \leq \inf _{p \geq 1} t^{-p} \cdot \mathbb{E}\|\mathbf{X}\|_{S_{p}}^{p} \] Furthermore, \[ \mathbb{E}\|\mathbf{X}\| \leq \inf _{p \geq 1}\left(\mathbb{E}\|\mathbf{X}\|_{S_{p}}^{p}\right)^{1 / p}. \]
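The tail bound is easy to sanity-check by Monte Carlo. A minimal sketch assuming NumPy, with symmetrised Gaussian (GOE-like) test matrices and \(p = 2\) (where \(\|\cdot\|_{S_2}\) is the Frobenius norm); the dimensions, threshold, and sample count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_samples, t, p = 5, 20_000, 3.0, 2

# Random symmetric test matrices: symmetrised Gaussians (GOE-like).
A = rng.standard_normal((n_samples, d, d))
X = (A + A.transpose(0, 2, 1)) / 2

eigs = np.linalg.eigvalsh(X)                  # spectrum of each sample
spec = np.abs(eigs).max(axis=1)               # spectral norm ||X||
schatten_p = (np.abs(eigs) ** p).sum(axis=1)  # ||X||_{S_p}^p

lhs = (spec >= t).mean()        # empirical P{||X|| >= t}
rhs = schatten_p.mean() / t**p  # t^{-p} E ||X||_{S_p}^p, should dominate lhs
```

Taking larger \(p\) tightens the bound for large \(t\) at the cost of a heavier moment assumption.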

Matrix Bernstein

Bounds the spectral norm of a sum of independent, bounded random matrices. TBC.
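A standard statement, as in Tropp (2015, Theorem 6.1.1): let \(X_1,\dots,X_n\) be independent, centred, self-adjoint random \(d \times d\) matrices with \(\|X_k\| \leq L\) almost surely, and put \(v := \left\|\sum_k \mathbb{E} X_k^{2}\right\|\). Then for all \(t \geq 0\), \[ \mathbb{P}\left\{\left\|\sum_k X_k\right\| \geq t\right\} \leq 2d \exp\left(\frac{-t^{2}/2}{v + Lt/3}\right). \]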

Matrix Efron-Stein

The “classical” Efron–Stein inequalities are simple. The matrix ones, not so much; see e.g. Paulin, Mackey, and Tropp (2016).


Handy results from Vershynin (2018):

Take \(X \sim N\left(0, I_{n}\right).\)

Then, for any fixed vectors \(u, v \in \mathbb{R}^{n},\) we have \[ \mathbb{E}\langle X, u\rangle\langle X, v\rangle=\langle u, v\rangle. \]
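A quick Monte Carlo check of this identity; a sketch assuming NumPy, with arbitrary dimension, vectors, and sample count:

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_samples = 4, 200_000

u = rng.standard_normal(n)
v = rng.standard_normal(n)
X = rng.standard_normal((n_samples, n))  # rows are draws of X ~ N(0, I_n)

# E <X,u><X,v> should match <u,v> up to Monte Carlo error.
empirical = np.mean((X @ u) * (X @ v))
exact = u @ v
```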

Given a vector \(u \in \mathbb{R}^{n}\), consider the random variable \(X_{u}:=\langle X, u\rangle .\)

Further, we know that \(X_{u} \sim N\left(0,\|u\|_{2}^{2}\right) .\) It follows that \[ \mathbb{E}\left[(X_{u}-X_{v})^2\right]^{1/2}=\|u-v\|_{2} \] for any fixed vectors \(u, v \in \mathbb{R}^{n} .\)
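In other words, \(u \mapsto X_u\) is an isometric embedding of \(\mathbb{R}^n\) into \(L^2\), which is also easy to check numerically (a sketch assuming NumPy; parameters arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_samples = 4, 200_000

u = rng.standard_normal(n)
v = rng.standard_normal(n)
X = rng.standard_normal((n_samples, n))  # rows are draws of X ~ N(0, I_n)

# The L2 distance between the functionals X_u, X_v equals ||u - v||_2.
lhs = np.sqrt(np.mean((X @ u - X @ v) ** 2))
rhs = np.linalg.norm(u - v)
```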

Grothendieck’s identity: For any fixed vectors \(u, v \in S^{n-1},\) we have \[ \mathbb{E} \operatorname{sign}X_{u} \operatorname{sign}X_{v}=\frac{2}{\pi} \arcsin \langle u, v\rangle. \]
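Grothendieck’s identity underlies the analysis of rounding schemes such as Goemans–Williamson; it too can be verified by simulation. A sketch assuming NumPy, with random unit vectors:

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_samples = 4, 500_000

u = rng.standard_normal(n); u /= np.linalg.norm(u)  # random points on S^{n-1}
v = rng.standard_normal(n); v /= np.linalg.norm(v)
X = rng.standard_normal((n_samples, n))

# E sign(X_u) sign(X_v) should match (2/pi) arcsin <u,v>.
empirical = np.mean(np.sign(X @ u) * np.sign(X @ v))
exact = (2 / np.pi) * np.arcsin(u @ v)
```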


Ahlswede, R., and A. Winter. 2002. “Strong Converse for Identification via Quantum Channels.” IEEE Transactions on Information Theory 48 (3): 569–79.
Ando, T. 1995. “Matrix Young Inequalities.” In Operator Theory in Function Spaces and Banach Lattices: Essays Dedicated to A.C. Zaanen on the Occasion of His 80th Birthday, edited by C. B. Huijsmans, M. A. Kaashoek, W. A. J. Luxemburg, and B. de Pagter, 33–38. Operator Theory Advances and Applications. Basel: Birkhäuser.
Bach, Francis R. 2013. “Sharp Analysis of Low-Rank Kernel Matrix Approximations.” In COLT, 30:185–209.
Bakherad, Mojtaba, Mario Krnic, and Mohammad Sal Moslehian. 2016. “Reverses of the Young Inequality for Matrices and Operators.” Rocky Mountain Journal of Mathematics 46 (4).
Bhatia, Rajendra. 1997. Matrix Analysis. Vol. 169. Graduate Texts in Mathematics. New York, NY: Springer.
Bhatia, Rajendra, and K. R. Parthasarathy. 2000. “Positive Definite Functions and Operator Inequalities.” Bulletin of the London Mathematical Society 32 (2): 214–28.
Bishop, Adrian N., Pierre Del Moral, and Angèle Niclas. 2018. “An Introduction to Wishart Matrix Moments.” Foundations and Trends® in Machine Learning 11 (2): 97–218.
Boucheron, Stéphane, Gábor Lugosi, and Pascal Massart. 2013. Concentration Inequalities: A Nonasymptotic Theory of Independence. 1st ed. Oxford: Oxford University Press.
Bousquet, Olivier, Ulrike von Luxburg, and Gunnar Rätsch. 2004. Advanced Lectures on Machine Learning: ML Summer Schools 2003, Canberra, Australia, February 2-14, 2003, Tübingen, Germany, August 4-16, 2003, Revised Lectures. Springer.
Brault, Romain, Florence d’Alché-Buc, and Markus Heinonen. 2016. “Random Fourier Features for Operator-Valued Kernels.” In Proceedings of The 8th Asian Conference on Machine Learning, 110–25.
Camino, Juan F., John William Helton, Robert E. Skelton, and Jieping Ye. 2003. “Matrix Inequalities: A Symbolic Procedure to Determine Convexity Automatically.” Integral Equation and Operator Theory 46 (4): 399–454.
Candès, Emmanuel J., and Benjamin Recht. 2009. “Exact Matrix Completion via Convex Optimization.” Foundations of Computational Mathematics 9 (6): 717–72.
Conde, Cristian. 2013. “Young Type Inequalities for Positive Operators.” Annals of Functional Analysis 4 (2): 144–52.
Geer, Sara van de. 2014. “Statistical Theory for High-Dimensional Models.” arXiv:1409.8557 [Math, Stat], September.
Gross, D. 2011. “Recovering Low-Rank Matrices From Few Coefficients in Any Basis.” IEEE Transactions on Information Theory 57 (3): 1548–66.
Gross, David, Yi-Kai Liu, Steven T. Flammia, Stephen Becker, and Jens Eisert. 2010. “Quantum State Tomography via Compressed Sensing.” Physical Review Letters 105 (15).
Handel, Ramon van. 2017. “Structured Random Matrices.” In Convexity and Concentration, edited by Eric Carlen, Mokshay Madiman, and Elisabeth M. Werner, 107–56. The IMA Volumes in Mathematics and Its Applications. New York, NY: Springer.
Hirzallah, Omar, and Fuad Kittaneh. 2000. “Matrix Young Inequalities for the Hilbert–Schmidt Norm.” Linear Algebra and Its Applications 308 (1): 77–84.
Huijsmans, C. B., M. A. Kaashoek, W. A. J. Luxemburg, and B. de Pagter, eds. 1995. Operator Theory in Function Spaces and Banach Lattices: Essays Dedicated to A.C. Zaanen on the Occasion of His 80th Birthday. Basel: Birkhäuser Basel.
Liao, Wenshi, and Junliang Wu. 2015. “Reverse Arithmetic-Harmonic Mean and Mixed Mean Operator Inequalities.” Journal of Inequalities and Applications 2015 (1): 215.
Lodhia, Asad, Keith Levin, and Elizaveta Levina. 2021. “Matrix Means and a Novel High-Dimensional Shrinkage Phenomenon.” arXiv.
Magnus, Jan R., and Heinz Neudecker. 2019. Matrix Differential Calculus with Applications in Statistics and Econometrics. 3rd ed. Wiley Series in Probability and Statistics. Hoboken (N.J.): Wiley.
Mahoney, Michael W. 2016. “Lecture Notes on Spectral Graph Methods.” arXiv Preprint arXiv:1608.04845.
Marton, Katalin. 2015. “Logarithmic Sobolev Inequalities in Discrete Product Spaces: A Proof by a Transportation Cost Distance.” arXiv:1507.02803 [Math], July.
Massart, Pascal. 2007. Concentration Inequalities and Model Selection: Ecole d’Eté de Probabilités de Saint-Flour XXXIII - 2003. Lecture Notes in Mathematics 1896. Berlin; New York: Springer-Verlag.
Mercer, A. McD. 2000. “Bounds for A–G, A–H, G–H, and a Family of Inequalities of Ky Fan’s Type, Using a General Method.” Journal of Mathematical Analysis and Applications 243 (1): 163–73.
Mohri, Mehryar, Afshin Rostamizadeh, and Ameet Talwalkar. 2018. Foundations of Machine Learning. Second edition. Adaptive Computation and Machine Learning. Cambridge, Massachusetts: The MIT Press.
Oliveira, Roberto Imbuzeiro. 2010. “The Spectrum of Random \(k\)-Lifts of Large Graphs (with Possibly Large \(k\)).” Journal of Combinatorics 1 (3): 285–306.
Paulin, Daniel, Lester Mackey, and Joel A. Tropp. 2016. “Efron–Stein Inequalities for Random Matrices.” The Annals of Probability 44 (5): 3431–73.
Sababheh, Mohammad. 2018. “On the Matrix Harmonic Mean.” arXiv.
Sharma, Rajesh. 2008. “Some More Inequalities for Arithmetic Mean, Harmonic Mean and Variance.” Journal of Mathematical Inequalities, no. 1: 109–14.
Stam, A. J. 1982. “Limit Theorems for Uniform Distributions on Spheres in High-Dimensional Euclidean Spaces.” Journal of Applied Probability 19 (1): 221–28.
Tropp, Joel. 2019. Matrix Concentration & Computational Linear Algebra / ENS Short Course.
Tropp, Joel A. 2015. An Introduction to Matrix Concentration Inequalities.
Uchiyama, Mitsuru. 2020. “Some Results on Matrix Means.” Advances in Operator Theory 5 (3): 728–33.
Vershynin, Roman. 2011. “Introduction to the Non-Asymptotic Analysis of Random Matrices.” arXiv:1011.3027 [Cs, Math], November.
———. 2015. “Estimation in High Dimensions: A Geometric Perspective.” In Sampling Theory, a Renaissance: Compressive Sensing and Other Developments, edited by Götz E. Pfander, 3–66. Applied and Numerical Harmonic Analysis. Cham: Springer International Publishing.
———. 2016. “Four Lectures on Probabilistic Methods for Data Science,” December.
———. 2018. High-Dimensional Probability: An Introduction with Applications in Data Science. 1st ed. Cambridge University Press.
Wainwright, Martin. 2019. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge Series in Statistical and Probabilistic Mathematics 48. Cambridge; New York, NY: Cambridge University Press.
Woodruff, David P. 2014. Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends in Theoretical Computer Science 1.0. Now Publishers.
