Placeholder to think about the many weird problems arising in very high dimensional statistical inference. There are many approaches to these problems: throwing out dimensions/predictors as in model selection, considering low dimensional projections, exploiting matrix structure via concentration or factorisation, or even exploiting tensor structure.

## Soap bubbles

High dimensional distributions are extremely odd, and concentrate in weird ways.
For example, for some natural definitions of *typical*, typical items are *not* average items in high dimensions.
See Sander Dieleman’s musings on typicality for an introduction to this plus some motivating examples.

For another example, consider this summary result of Vershynin (2015):

Let \(K\) be an isotropic convex body (e.g. a suitably scaled \(L_2\) ball) in \(\mathbb{R}^{n},\) and let \(X\) be a random vector uniformly distributed in \(K\), with \(\mathbb{E}X=0\) and \(\mathbb{E}XX^{\top}=I_n.\) Then the following is true for some positive constants \(C,c\):

- (Concentration of volume) For every \(t \geq 1\), one has \[ \mathbb{P}\left\{\|X\|_{2}>t \sqrt{n}\right\} \leq \exp (-c t \sqrt{n}) \]
- (Thin shell) For every \(\varepsilon \in(0,1),\) one has \[ \mathbb{P}\left\{\left|\|X\|_{2}-\sqrt{n}\right|>\varepsilon \sqrt{n}\right\} \leq C \exp \left(-c \varepsilon^{3} n^{1 / 2}\right) \]
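As a rough numerical illustration of the thin-shell statement, here is a minimal sketch assuming the simplest isotropic convex body: an \(L_2\) ball of radius \(\sqrt{n+2}\), which is the scaling that gives a uniform distribution identity covariance. The helper names `sample_isotropic_ball` and `shell_fraction`, the sample size, and the choice \(\varepsilon = 0.1\) are all invented for this note; it estimates the shell probability by Monte Carlo rather than checking the constants \(C, c\).

```python
import numpy as np


def sample_isotropic_ball(n_samples, dim, rng):
    """Draw points uniformly from the L2 ball of radius sqrt(dim + 2) in R^dim."""
    # Uniform direction on the sphere: normalise a standard Gaussian vector.
    g = rng.standard_normal((n_samples, dim))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Radius with density proportional to r^(dim - 1) on [0, sqrt(dim + 2)].
    radius = np.sqrt(dim + 2) * rng.uniform(size=(n_samples, 1)) ** (1.0 / dim)
    return directions * radius


def shell_fraction(dim, eps=0.1, n_samples=2000, seed=0):
    """Estimate P(| ||X|| - sqrt(dim) | <= eps * sqrt(dim)) for X uniform in the ball."""
    rng = np.random.default_rng(seed)
    x = sample_isotropic_ball(n_samples, dim, rng)
    norms = np.linalg.norm(x, axis=1)
    return float(np.mean(np.abs(norms - np.sqrt(dim)) <= eps * np.sqrt(dim)))


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, shell_fraction(dim))
```

In low dimensions only a modest fraction of the samples should land in the shell, while for large `dim` essentially all of them should.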

That is, even with the mass *uniformly distributed* over space, as the dimension grows, it all ends up in a thin shell, because volume grows exponentially in dimension.
This is popularly known as the soap bubble phenomenon.
It is one of the phenomena that lead to interesting behaviour in low dimensional projections.
The analogous statement for Gaussians is the *Gaussian Annulus Theorem*:
for a \(d\)-dimensional spherical Gaussian with unit variance in each direction, for any \(\beta \leq \sqrt{d}\), all but at most \(3 e^{-c \beta^{2}}\) of the probability mass lies within the annulus \(\sqrt{d}-\beta \leq|\mathbf{x}| \leq \sqrt{d}+\beta,\) where \(c\) is a fixed positive constant.
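The Gaussian version is easy to poke at numerically. The following is a minimal sketch, not a proof: it draws standard Gaussian vectors and summarises their norms. The function names `gaussian_norm_stats` and `annulus_mass`, the sample size, and the seed are assumptions made for this illustration.

```python
import numpy as np


def gaussian_norm_stats(dim, n_samples=2000, seed=0):
    """Return the sample mean and standard deviation of ||x|| for x ~ N(0, I_dim)."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(rng.standard_normal((n_samples, dim)), axis=1)
    return float(norms.mean()), float(norms.std())


def annulus_mass(dim, beta, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(sqrt(dim) - beta <= ||x|| <= sqrt(dim) + beta)."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(rng.standard_normal((n_samples, dim)), axis=1)
    return float(np.mean(np.abs(norms - np.sqrt(dim)) <= beta))


if __name__ == "__main__":
    for dim in (10, 100, 1000):
        mean, std = gaussian_norm_stats(dim)
        # Compare the mean norm with sqrt(dim) and watch the spread stay O(1).
        print(dim, np.sqrt(dim), mean, std, annulus_mass(dim, beta=3.0))
```

The mean norm should track \(\sqrt{d}\) while the spread of the norms stays roughly constant, so the annulus becomes thin relative to its radius. Note that the density is highest at the origin, yet essentially no samples land near it, which is the typicality point made above.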

## Empirical processes in high dimensions

Combining empirical process theory with high dimensional statistics gets us to some extremely interesting statistics. See, e.g. van de Geer (2014b).

## Markov Chain Monte Carlo in high dimensions

TBD

## References

*Statistics for High-Dimensional Data: Methods, Theory and Applications*. 2011 edition. Heidelberg ; New York: Springer.

*IEEE Transactions on Information Theory* 52 (2): 489–509. https://doi.org/10.1109/TIT.2005.862083.

*The Annals of Statistics* 42 (3): 1166–1202. https://doi.org/10.1214/14-AOS1221.

*Bioinformatics* 21 (13): 3001–8. https://doi.org/10.1093/bioinformatics/bti422.

*The Annals of Statistics* 21 (2): 867–89. https://doi.org/10.1214/aos/1176349155.

*Journal of Machine Learning Research* 15 (1): 2869–909. http://jmlr.org/papers/v15/javanmard14a.html.

*TEST*, April. https://doi.org/10.1007/s11749-015-0441-7.

*Sampling Theory, a Renaissance: Compressive Sensing and Other Developments*, edited by Götz E. Pfander, 3–66. Applied and Numerical Harmonic Analysis. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-19749-4_1.

*High-Dimensional Probability: An Introduction with Applications in Data Science*. 1st ed. Cambridge University Press. https://doi.org/10.1017/9781108231596.

*Annual Review of Statistics and Its Application* 1 (1): 233–53. https://doi.org/10.1146/annurev-statistics-022513-115643.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 76 (1): 217–42. https://doi.org/10.1111/rssb.12026.
