Randomized low dimensional projections



One way I can get at the confusing behaviours of high dimensional distributions is to instead look at low dimensional projections of them. If I have a (possibly fixed) data matrix and a random dimensional projection, what distribution does the projection have?

This idea pertains to many others: matrix factorisations, restricted isometry properties, Riesz bases, randomised regression, compressed sensing. You could also consider these results as arising from/resulting in certain structured random matrices.

Tutorials

There is a confusing note soup here, sorry. You might find it better to read a coherent overview such as Meckes’ lecture slides which include a lot of important recent developments, many of which she invented:

Related: Weird slicing problems in convex geometry. For a theoretical background as to how that relates, see Guédon (2014).

Inner products

Djalil Chafaï introduces the The Funk-Hecke formula, also mentioned under isotropic RVs, which gives us a formula for a particularly simple case, of unit-norm RVs:

… if \(X\) is a random vector of \(\mathbb{R}^{n}\) uniformly distributed on \(\mathbb{S}^{n-1}\) then for all \(y \in \mathbb{S}^{n-1}\), the law of \(X \cdot y\) has density \[ t \in[-1,1] \mapsto \frac{\Gamma\left(\frac{n}{2}\right)}{\sqrt{\pi} \Gamma\left(\frac{n-1}{2}\right)}\left(1-t^{2}\right)^{\frac{n-3}{2}} \] This law does not depend on the choice of \(y \in \mathbb{S}^{n-1}\). It is symmetric in the sense that \(X \cdot y\) and \(-(X \cdot y)\) have same law. The law of \(|X \cdot y|\) is the image of the law \(\operatorname{Beta}\left(\frac{3}{2}, \frac{n-1}{2}\right)\) by the map \(u \mapsto \sqrt{u}\). The law of \(X \cdot y\) is

  • if \(n=2\) : an arcsine law,
  • if \(n=3\) : a uniform law (Archimedes principle),
  • if \(n=4\) : a semicircle law.

whuber asserts that \((X \cdot y+1)\sim\operatorname{Beta}((d-1)/2,(d-1)/2),\) and \(\operatorname{Var}(X \cdot y)=1/d.\)

Random projections are kinda Gaussian

More generally things are not so exact. But they are still reasonably nice, in that there are lots of tasty limit theorems with nice regular behaviour.

A classic introductory concept: Diaconis-Freedman effect. Diaconis and Freedman (1984) show that (under some mild omitted conditions), \[ \left\{x_{1}, \ldots, x_{n}\right\} \subseteq \mathbb{R}^{d} \] is a data set (possibly deterministic with no assumption on generating process), \(\theta\) is a uniform random point in the sphere \(\mathbb{S}^{d-1},\) and \[ \mu_{x}^{\theta}:=\frac{1}{n} \sum_{i=1}^{n} \delta_{\left\langle x_{i}, \theta\right\rangle} \] is the empirical measure of the projection of the \(x_{i}\) onto \(\theta\), then as \(n, d \rightarrow \infty,\) the measures \(\mu_{x}^{\theta}\) tend to \(\mathcal{N}\left(0, \sigma^{2}\right)\) weakly in probability. This succinct statement is modeled on Elizabeth Meckes'.

A lesson here is that even non-Gaussian, non-independent data can become nearly i.i.d. Gaussian in low dimensional projection, as Dasgupta, Hsu, and Verma (2006) argue in their introduction.

This has been taken to incredible depth in the work of Elizabeth Meckes 1980—2020 whose papers serve as the canonical textbook in the area for now. Two foundational ones are Chatterjee and Meckes (2008) and E. Meckes (2009) and there is a kind of user guide in E. Meckes (2012b) which leverages Stein’s method a whole bunch.

Random projections are distance preserving

What makes random embeddings go. The most famous result here is the Johnson-Lindenstrauss lemma.

A simple proof of that is given by Dasgupta and Gupta (2003)

Projection statistics

Another key phrase we can look for is probability on the Stiefel manifold, which is a generalization of a familiar concept from random orthonormal matrices. Stiefel manifolds generalise an orthonormal matrix because they can map between spaces of different dimension. Formally, the Stiefel manifold \(V_{k, m}\) is the space of \(k\) frames in the \(m\) -dimensional real Euclidean space \(R^{m},\) represented by the set of \(m \times k\) matrices \(X\) such that \(X^{\prime} X=I_{k},\) where \(I_{k}\) is the \(k \times k\) identity matrix. There are some interesting cases in low dimensional projections served by \(k\ll m\) especially \(k=1.\)

Cool results in this domain are, e.g. Chikuse and 筑瀬 (2003); E. S. Meckes and Meckes (2013); E. Meckes (2012a); Stam (1982).

General projections results are in Dümbgen and Del Conte-Zerial (2013).

An important trick here is the distribution of isotropic unit vectors.

Let \(Z=Z^{(q)}:=\left(Z_{1}, Z_{2}, \ldots, Z_{d}\right)\) be a random matrix in \(\mathbb{R}^{m \times k}\) with independent, standard Gaussian column vectors \(Z_{j} \in \mathbb{R}^{m} .\) Then \[ \Theta:=Z\left(Z^{\top} Z\right)^{-1 / 2}=Z/\|Z\|_2^2 \] has the desired distribution, and \[ \Theta=m^{-1 / 2} Z\left(I+O_{p}\left(m^{-1 / 2}\right)\right) \quad \text { as } m \rightarrow \infty. \]

Vershynin’s writing on a variety of hard high-dimensional probability results is pretty accessible: Vershynin (2015); Vershynin (2018). These bleed over into concentration results.

I wrote one of my own… TBD.

References

Achlioptas, Dimitris. 2003. “Database-Friendly Random Projections: Johnson-Lindenstrauss with Binary Coins.” Journal of Computer and System Sciences, Special Issue on PODS 2001, 66 (4): 671–87. https://doi.org/10.1016/S0022-0000(03)00025-4.
Anttila, Milla, Keith Ball, and Irini Perissinaki. 2003. “The Central Limit Problem for Convex Bodies.” Transactions of the American Mathematical Society 355 (12): 4723–35. https://doi.org/10.1090/S0002-9947-03-03085-X.
Beckner, William. 1989. “A Generalized Poincare Inequality for Gaussian Measures.” Proceedings of the American Mathematical Society 105 (2): 397. https://doi.org/10.2307/2046956.
Bhattacharya, Abhishek, and Rabi Bhattacharya. 2012. Nonparametric Inference on Manifolds: With Applications to Shape Spaces. Institute of Mathematical Statistics Monographs. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139094764.
Bingham, Ella, and Heikki Mannila. 2001. “Random Projection in Dimensionality Reduction: Applications to Image and Text Data.” In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 245–50. KDD ’01. New York, NY, USA: ACM. https://doi.org/10.1145/502512.502546.
Bobkov, S. G. 2003. “On Concentration of Distributions of Random Weighted Sums.” The Annals of Probability 31 (1): 195–215. https://doi.org/10.1214/aop/1046294309.
Bobkov, S. G., and A. Koldobsky. 2003. “On the Central Limit Property of Convex Bodies.” In Geometric Aspects of Functional Analysis: Israel Seminar 2001-2002, edited by Vitali D. Milman and Gideon Schechtman, 44–52. Lecture Notes in Mathematics. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-540-36428-3_5.
Brehm, Ulrich, and Jurgen Voigt. n.d. “Asymptotics of Cross Sections for Convex Bodiesy,” 18.
Candès, Emmanuel J., and Terence Tao. 2006. “Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?” IEEE Transactions on Information Theory 52 (12): 5406–25. https://doi.org/10.1109/TIT.2006.885507.
Chatterjee, Sourav, and Elizabeth Meckes. 2008. “Multivariate Normal Approximation Using Exchangeable Pairs.” arXiv:math/0701464, January. http://arxiv.org/abs/math/0701464.
Chikuse, Yasuko, and 筑瀬靖子. 2003. Statistics on Special Manifolds. New York, NY: Springer New York. https://link.springer.com/book/10.1007%2F978-0-387-21540-2.
Dasgupta, Sanjoy. 2000. “Experiments with Random Projection.” In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, 143–51. UAI’00. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. http://arxiv.org/abs/1301.3849.
Dasgupta, Sanjoy, and Anupam Gupta. 2003. “An Elementary Proof of a Theorem of Johnson and Lindenstrauss.” Random Structures & Algorithms 22 (1): 60–65. https://doi.org/10.1002/rsa.10073.
Dasgupta, Sanjoy, Daniel Hsu, and Nakul Verma. 2006. “A Concentration Theorem for Projections.” In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, 114–21. UAI’06. Arlington, Virginia, USA: AUAI Press. http://arxiv.org/abs/1206.6813.
Diaconis, Persi, and David Freedman. 1984. “Asymptotics of Graphical Projection Pursuit.” The Annals of Statistics 12 (3): 793–815. http://www.jstor.org/stable/2240961.
Dümbgen, Lutz, and Perla Del Conte-Zerial. 2013. “On Low-Dimensional Projections of High-Dimensional Distributions.” From Probability to Statistics and Back: High-Dimensional Models and Processes – A Festschrift in Honor of Jon A. Wellner, January, 91–104. https://doi.org/10.1214/12-IMSCOLL908.
Eldan, R., and B. Klartag. 2008. “Pointwise Estimates for Marginals of Convex Bodies.” Journal of Functional Analysis 254 (8): 2275–93. https://doi.org/10.1016/j.jfa.2007.08.014.
Eldan, Ronen, and Bo’az Klartag. 2010. “Approximately Gaussian Marginals and the Hyperplane Conjecture.” arXiv:1001.0875 [math], January. http://arxiv.org/abs/1001.0875.
Freund, Yoav, Sanjoy Dasgupta, Mayank Kabra, and Nakul Verma. 2007. “Learning the Structure of Manifolds Using Random Projections.” In Advances in Neural Information Processing Systems, 473–80. http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2007_133.pdf.
Gantert, Nina, Steven Soojin Kim, and Kavita Ramanan. 2017. “Large Deviations for Random Projections of \(\ell^{p}\) Balls.” The Annals of Probability 45 (6B): 4419–76. https://doi.org/10.1214/16-AOP1169.
Guédon, Olivier. 2014. “Concentration Phenomena in High Dimensional Geometry.” Edited by Arnaud Guillin. ESAIM: Proceedings 44 (January): 47–60. https://doi.org/10.1051/proc/201444002.
Hall, Peter, and Ker-Chau Li. 1993. “On Almost Linearity of Low Dimensional Projections from High Dimensional Data.” The Annals of Statistics 21 (2): 867–89. https://doi.org/10.1214/aos/1176349155.
Har-Peled, Sariel, Piotr Indyk, and Rajeev Motwani. 2012. “Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality.” Theory of Computing 8 (1): 321–50. https://doi.org/10.4086/toc.2012.v008a014.
Houdré, Christian, Michel Ledoux, and Emanuel Milman. 2011. Concentration, Functional Inequalities and Isoperimetry: International Workshop on Concentration, Functional Inequalities, and Isoperimetry, October 29-November 1, 2009, Florida Atlantic University, Boca Raton, Florida. American Mathematical Soc.
Indyk, Piotr, and Rajeev Motwani. 1998. “Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality.” In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, 604–13. STOC ’98. New York, NY, USA: ACM. https://doi.org/10.1145/276698.276876.
Jiang, Haotian, Yin Tat Lee, and Santosh S. Vempala. 2020. “A Generalized Central Limit Conjecture for Convex Bodies.” In Geometric Aspects of Functional Analysis: Israel Seminar (GAFA) 2017-2019 Volume II, edited by Bo’az Klartag and Emanuel Milman, 1–41. Lecture Notes in Mathematics. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-46762-3_1.
Joarder, Anwarul H., and Mir M. Ali. 1996. “On the Characterization of Spherical Distributions.” Journal of Information and Optimization Sciences 17 (1): 177–84. https://doi.org/10.1080/02522667.1996.10699271.
Kim, Steven Soojin, Yin-Ting Liao, and Kavita Ramanan. 2020. “An Asymptotic Thin Shell Condition and Large Deviations for Random Multidimensional Projections.” arXiv:1912.13447 [math], June. http://arxiv.org/abs/1912.13447.
Klartag, B. 2007a. “A Central Limit Theorem for Convex Sets.” Inventiones Mathematicae 168 (1): 91–131. https://doi.org/10.1007/s00222-006-0028-8.
———. 2007b. “Power-Law Estimates for the Central Limit Theorem for Convex Sets.” Journal of Functional Analysis 245 (1): 284–310. https://doi.org/10.1016/j.jfa.2006.12.005.
Lahiri, Subhaneil, Peiran Gao, and Surya Ganguli. 2016. “Random Projections of Random Manifolds.” arXiv:1607.04331 [cs, q-Bio, Stat], July. http://arxiv.org/abs/1607.04331.
Li, Ping, Trevor J. Hastie, and Kenneth W. Church. 2006. “Very Sparse Random Projections.” In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 287–96. KDD ’06. New York, NY, USA: ACM. https://doi.org/10.1145/1150402.1150436.
Meckes, Elizabeth. 2006. “An Infinitesimal Version of Stein’s Method of Exchangeable Pairs.”
———. 2009. “On Stein’s Method for Multivariate Normal Approximation.” In High Dimensional Probability V: The Luminy Volume, 153–78. Beachwood, Ohio, USA: Institute of Mathematical Statistics. https://doi.org/10.1214/09-IMSCOLL511.
———. 2012a. “Projections of Probability Distributions: A Measure-Theoretic Dvoretzky Theorem.” In Geometric Aspects of Functional Analysis: Israel Seminar 2006–2010, edited by Bo’az Klartag, Shahar Mendelson, and Vitali D. Milman, 317–26. Lecture Notes in Mathematics. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-29849-3_18.
———. 2012b. “Approximation of Projections of Random Vectors.” Journal of Theoretical Probability 25 (2): 333–52. https://doi.org/10.1007/s10959-010-0299-2.
Meckes, Elizabeth S., and Mark W. Meckes. 2013. “Concentration and Convergence Rates for Spectral Measures of Random Matrices.” Probability Theory and Related Fields 156 (1-2): 145–64. https://doi.org/10.1007/s00440-012-0423-6.
Meckes, Mark W. 2008. “Gaussian Marginals of Convex Bodies with Symmetries.” arXiv:math/0606073, June. http://arxiv.org/abs/math/0606073.
Paouris, G. 2006. “Concentration of Mass on Convex Bodies” 16: 29.
Peña, Victor H., Tze Leung Lai, and Qi-Man Shao. 2008. Self-Normalized Processes: Limit Theory and Statistical Applications. Springer Science & Business Media.
Reeves, G. 2017. “Conditional Central Limit Theorems for Gaussian Projections.” In 2017 IEEE International Symposium on Information Theory (ISIT), 3045–49. https://doi.org/10.1109/ISIT.2017.8007089.
Reinert, Gesine, and Adrian Röllin. 2007. “Multivariate Normal Approximation with Stein’s Method of Exchangeable Pairs Under a General Linearity Condition,” November. https://doi.org/10.1214/09-AOP467.
Rudelson, Mark, and Roman Vershynin. 2013. “Hanson-Wright Inequality and Sub-Gaussian Concentration.” Electronic Communications in Probability 18 (none): 1–9. https://doi.org/10.1214/ECP.v18-2865.
Sodin, S. 2007. “Tail-Sensitive Gaussian Asymptotics for Marginals of Concentrated Measures in High Dimension.” In Geometric Aspects of Functional Analysis, edited by Vitali D. Milman and Gideon Schechtman, 1910:271–95. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-72053-9_16.
Stam, A. J. 1982. “Limit Theorems for Uniform Distributions on Spheres in High-Dimensional Euclidean Spaces.” Journal of Applied Probability 19 (1): 221–28. https://doi.org/10.2307/3213932.
Stein, Charles. 1972. “A Bound for the Error in the Normal Approximation to the Distribution of a Sum of Dependent Random Variables.” Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, January, 583–602. https://projecteuclid.org/ebooks/berkeley-symposium-on-mathematical-statistics-and-probability/Proceedings-of-the-Sixth-Berkeley-Symposium-on-Mathematical-Statistics-and/chapter/A-bound-for-the-error-in-the-normal-approximation-to/bsmsp/1200514239.
Vershynin, Roman. 2011. “Introduction to the Non-Asymptotic Analysis of Random Matrices.” arXiv:1011.3027 [cs, Math], November. http://arxiv.org/abs/1011.3027.
———. 2015. “Estimation in High Dimensions: A Geometric Perspective.” In Sampling Theory, a Renaissance: Compressive Sensing and Other Developments, edited by Götz E. Pfander, 3–66. Applied and Numerical Harmonic Analysis. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-19749-4_1.
———. 2018. High-Dimensional Probability: An Introduction with Applications in Data Science. 1st ed. Cambridge University Press. https://doi.org/10.1017/9781108231596.
Weizsäcker, Heinrich von. 1997. “Sudakov’s Typical Marginals, Random Linear Functionals and a Conditional Central Limit Theorem.” Probability Theory and Related Fields 107 (3): 313–24. https://doi.org/10.1007/s004400050087.

No comments yet. Why not leave one?

GitHub-flavored Markdown & a sane subset of HTML is supported.