Placeholder to think about the many weird problems arising in very high dimensional statistical inference. There are many approaches to this problem: throwing out dimensions/predictors as in model selection, considering low dimensional projections, viewing objects with matrix structure for concentration or factorisation, or even tensor structure.

## Soap bubbles

High dimensional distributions are extremely odd, and concentrate in weird ways.
For example, for some natural definitions of *typical*, typical items are *not* average items in high dimensions.
See Sander Dieleman’s musings on typicality for an introduction to this, plus some motivating examples.

For another example, consider this summary result of Vershynin (2015):

Let \(K\) be an isotropic convex body (e.g. an \(L_2\) ball) in \(\mathbb{R}^{n},\) and let \(X\) be a random vector uniformly distributed in \(K\), with \(\mathbb{E}X=0\) and \(\mathbb{E}XX^{\top}=I_n.\) Then the following is true for some positive constants \(C,c\):

- (Concentration of volume) For every \(t \geq 1\), one has \[ \mathbb{P}\left\{\|X\|_{2}>t \sqrt{n}\right\} \leq \exp (-c t \sqrt{n}) \]
- (Thin shell) For every \(\varepsilon \in(0,1),\) one has \[ \mathbb{P}\left\{\left|\|X\|_{2}-\sqrt{n}\right|>\varepsilon \sqrt{n}\right\} \leq C \exp \left(-c \varepsilon^{3} n^{1 / 2}\right) \]

That is, even with the mass *uniformly distributed* over space, as the dimension grows, it all ends up in a thin shell, because volume grows exponentially in dimension.
This is popularly known as a soap bubble phenomenon.
This is one of the phenomena that leads to interesting behaviour in low dimensional projection.
The Gaussian analogue goes by the more formal name of the *Gaussian Annulus Theorem*.
Turning it around, for a \(d\)-dimensional spherical Gaussian with unit variance in each direction, for any \(\beta \leq \sqrt{d}\), all but at most \(3 e^{-c \beta^{2}}\) of the probability mass lies within the annulus \(\sqrt{d}-\beta \leq\|\mathbf{x}\| \leq \sqrt{d}+\beta,\) where \(c\) is a fixed positive constant.
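Both phenomena are easy to see numerically. Below is a minimal Monte Carlo sketch (my own illustration, not from the cited papers) using numpy: it draws samples from a spherical Gaussian and from the uniform distribution on an isotropic ball (radius \(\sqrt{n+2}\), which makes \(\mathbb{E}XX^{\top}=I_n\)), and checks that in both cases the norms pile up near \(\sqrt{n}\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 5000  # dimension, number of samples

# Spherical Gaussian: each coordinate is N(0, 1).
gauss = rng.standard_normal((m, n))

# Uniform on an isotropic ball: a uniform direction times a radius
# with density proportional to r^(n-1), scaled so that E[X X^T] = I_n.
g2 = rng.standard_normal((m, n))
dirs = g2 / np.linalg.norm(g2, axis=1, keepdims=True)
radii = np.sqrt(n + 2) * rng.random(m) ** (1.0 / n)
unif_ball = dirs * radii[:, None]

results = {}
for name, x in [("gaussian", gauss), ("uniform ball", unif_ball)]:
    norms = np.linalg.norm(x, axis=1)
    # Mean norm relative to sqrt(n), and relative spread of the shell.
    results[name] = (norms.mean() / np.sqrt(n), norms.std() / np.sqrt(n))
    print(name, results[name])
```

Despite the two distributions filling space very differently, both print a mean ratio near 1 with a relative spread of a few percent: essentially all the mass sits in a thin shell of radius \(\approx\sqrt{n}\).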

## Convex hulls

Balestriero, Pesenti, and LeCun (2021) cite Bárány and Füredi (1988):

Given a \(d\)-dimensional dataset \(\boldsymbol{X} \triangleq\left\{\boldsymbol{x}_{1}, \ldots, \boldsymbol{x}_{N}\right\}\) with i.i.d. samples \(\boldsymbol{x}_{n} \sim \mathcal{N}\left(0, I_{d}\right), \forall n\), the probability that a new sample \(\boldsymbol{x} \sim \mathcal{N}\left(0, I_{d}\right)\) is in the interpolation regime (recall Def. 1) has the following limiting behavior \[ \lim _{d \rightarrow \infty} p(\underbrace{\boldsymbol{x} \in \operatorname{Hull}(\boldsymbol{X})}_{\text {interpolation }})= \begin{cases}1 & \Longleftrightarrow N>d^{-1} 2^{d / 2} \\ 0 & \Longleftrightarrow N<d^{-1} 2^{d / 2}\end{cases} \]

They observe that this implies that high dimensional statistics rarely interpolates between data points, which is unsurprising, but only in retrospect.
Despite some expertise in high-dimensional problems I had never noticed this fact myself.
Interestingly, they collect evidence suggesting that low-dimensional projections and latent spaces are *also* rarely interpolating.
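This is also cheap to check empirically. The sketch below (my own illustration, assuming scipy is available) tests hull membership as a linear feasibility problem: \(\boldsymbol{x} \in \operatorname{Hull}(\boldsymbol{X})\) iff there exist \(\lambda \geq 0\) with \(\sum_n \lambda_n = 1\) and \(\boldsymbol{X}^{\top}\lambda = \boldsymbol{x}\). With \(N = 1000\) samples, that count is far above the threshold \(d^{-1}2^{d/2}\) for \(d = 2\) (threshold \(\approx 1\)) and far below it for \(d = 50\) (threshold \(\approx 6.7 \times 10^{5}\)):

```python
import numpy as np
from scipy.optimize import linprog

def in_hull(point, X):
    """True iff `point` is in the convex hull of the rows of `X`.

    Solved as LP feasibility: find lambda >= 0 with sum(lambda) = 1
    and X.T @ lambda = point.
    """
    N = X.shape[0]
    A_eq = np.vstack([X.T, np.ones((1, N))])
    b_eq = np.concatenate([point, [1.0]])
    res = linprog(np.zeros(N), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * N, method="highs")
    return res.status == 0  # 0 = optimal, i.e. a feasible lambda exists

rng = np.random.default_rng(0)
N, trials = 1000, 200
hull_prob = {}
for d in (2, 50):
    X = rng.standard_normal((N, d))
    hits = sum(in_hull(rng.standard_normal(d), X) for _ in range(trials))
    hull_prob[d] = hits / trials
print(hull_prob)
```

At \(d = 2\) nearly every fresh Gaussian point lands inside the hull; at \(d = 50\) essentially none do, even though the hull is built from the same number of samples.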

## Empirical processes in high dimensions

Combining empirical process theory with high dimensional statistics leads to some interesting models. See, e.g., van de Geer (2014b).

## Markov Chain Monte Carlo in high dimensions

TBD

## References

*arXiv:2110.09485 [Cs]*, October.

*Probability Theory and Related Fields* 77 (2): 231–40.

*arXiv:1401.2906 [Math]*, January.

*Statistics for High-Dimensional Data: Methods, Theory and Applications*. 2011 edition. Heidelberg; New York: Springer.

*arXiv:1503.06426 [Stat]* 9 (1): 1449–73.

*IEEE Transactions on Information Theory* 52 (2): 489–509.

*arXiv:1608.00060 [Econ, Stat]*, July.

*arXiv:1812.08089 [Math, Stat]*, December.

*arXiv:1809.05224 [Econ, Math, Stat]*, September.

*arXiv:1403.7023 [Math, Stat]*. Vol. 131.

*arXiv:1409.8557 [Math, Stat]*, September.

*The Annals of Statistics* 42 (3): 1166–1202.

*arXiv:1610.00494 [Cs, Stat]*, October.

*arXiv:1706.07180 [Cs, Math, Stat]*, June.

*Bioinformatics* 21 (13): 3001–8.

*The Annals of Statistics* 21 (2): 867–89.

*Journal of Machine Learning Research* 15 (1): 2869–909.

*TEST*, April.

*arXiv:2107.10885 [Math, Stat]*, July.

*arXiv:1504.06706 [Math, Stat]*, April.

*arXiv:1512.03099 [Cs, Math, Stat]*, December.

*Sampling Theory, a Renaissance: Compressive Sensing and Other Developments*, edited by Götz E. Pfander, 3–66. Applied and Numerical Harmonic Analysis. Cham: Springer International Publishing.

*High-Dimensional Probability: An Introduction with Applications in Data Science*. 1st ed. Cambridge University Press.

*Annual Review of Statistics and Its Application* 1 (1): 233–53.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 76 (1): 217–42.
