Effective sample size


We have an estimator \(\hat{\theta}\) of some statistic which is, for the sake of argument, presumed to be the mean calculated from observations of some stochastic process \(\mathsf{v}\). Under certain assumptions we can use central limit theorems to find that the variance of our estimator calculated from \(N\) i.i.d. samples is given by \(\operatorname{Var}(\hat{\theta})\propto 1/N.\) Effective Sample Size (ESS) gives us a different \(N\), \(N_{\text{eff}}\), such that \(\operatorname{Var}(\hat{\theta})\propto 1/N_{\text{eff}}\) when the samples are not i.i.d.

Statistics

When your experimental design produces highly correlated observations (e.g. because it is a time series, or because of non-random sampling), your data might give you less information than you hope, or than you would expect from the uncorrelated case, with regard to the particular statistic you wish to calculate. In practice all the introductions use the mean.
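
A small worked example may make this concrete. It is only a sketch, under assumptions that are not in the text above: \(N\) observations with a common variance \(\sigma^2\) and a common pairwise correlation \(\rho\) (the symbols \(\sigma^2\) and \(\rho\) are introduced here for illustration). The variance of the sample mean is then

\[ \operatorname{Var}(\hat{\theta}) = \frac{\sigma^2}{N}\left(1 + (N-1)\rho\right), \qquad\text{so}\qquad N_{\text{eff}} = \frac{N}{1 + (N-1)\rho}. \]

With \(N=100\) and \(\rho=0.1\) this gives \(N_{\text{eff}} = 100/10.9 \approx 9.2\): a hundred mildly correlated observations tell us about as much about the mean as nine or so independent ones.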

This is a kind of dual to effective degrees of freedom, which tells you how far your sample size can get you.

Turns out to be important in, e.g. LASSO and, circularly, covariance estimation.

Huber’s (1981) “equivalent number of observations” is probably the same?

Monte Carlo estimation

Related, but a slightly different setup. This is not about experimental samples, but about the number of simulations in simulation-based inference, where you are using e.g. importance sampling or a sequential Markov chain sampler. From Sebastian Nowozin, Effective Sample Size in Importance Sampling:

[The effective sample size] can be used after or during importance sampling to provide a quantitative measure of the quality of the estimated mean. Even better, the estimate is provided on a natural scale of worth in samples from p, that is, if we use \(n=1000\) samples \(X_i\sim q\) and obtain an ESS of say 350 then this indicates that the quality of our estimate is about the same as if we would have used 350 direct samples.
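
The quote above does not state a formula. A commonly used estimate, usually attributed to Kong (1992), is \((\sum_i w_i)^2 / \sum_i w_i^2\) for importance weights \(w_i = p(X_i)/q(X_i)\). The following is a minimal sketch of that estimate, not a definitive implementation. The function name `importance_ess` and the Gaussian target and proposal are assumptions made up for illustration.

```python
import numpy as np


def importance_ess(log_weights):
    """Estimate ESS as (sum w)^2 / sum(w^2) from unnormalised log-weights."""
    log_weights = np.asarray(log_weights, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratio is unchanged by rescaling all weights by a constant.
    w = np.exp(log_weights - np.max(log_weights))
    return w.sum() ** 2 / np.sum(w ** 2)


# Hypothetical example: target p = N(0, 1), proposal q = N(0, 2^2).
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(loc=0.0, scale=2.0, size=n)
log_p = -0.5 * x ** 2            # log-density of p, up to a constant
log_q = -0.5 * (x / 2.0) ** 2    # log-density of q, up to a constant
ess = importance_ess(log_p - log_q)
print(f"ESS is about {ess:.0f} out of {n} draws")
```

Equal weights give an ESS of \(n\), and a single dominant weight gives an ESS close to 1, which is the behaviour we would want from a measure "on a natural scale of worth in samples from p".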

Since Markov Chain Monte Carlo is so common in statistical inference these days, effective sample size for simulations might be a more common use of effective sample size in statistics than the directly statistical notion of the term.

In practice, we usually cargo cult in the formulae for ESS from a paper on how to do it best. The Stan method for ESS in MCMC, which is a best-practice method AFAICT, is based on the autocorrelogram.

\[ \tau = 1 + 2 \sum_{t=1}^{m} \rho_{t} \] and

\[N_{\text{eff}}=\frac{N}{\tau}.\]

Here \(\rho_{t}\) is the autocorrelation at lag \(t\), and \(m\) is the lag at which the sum is truncated. We are implicitly assuming the statistic of interest here is the mean. A short calculation should persuade us that this gives us the convergence rate we expect, but TBH I have not done that here. It seems plausible at least.
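
For reference, a sketch of that calculation, assuming the observations \(v_1,\dots,v_N\) come from a stationary process with variance \(\sigma^2\) (the symbols \(v_i\) and \(\sigma^2\) are introduced here for illustration). Expanding the variance of the sample mean over all pairs gives

\[ \operatorname{Var}(\hat{\theta}) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\operatorname{Cov}(v_i, v_j) = \frac{\sigma^2}{N}\left[1 + 2\sum_{t=1}^{N-1}\left(1-\frac{t}{N}\right)\rho_{t}\right]. \]

If the autocorrelations are negligible beyond some lag \(m \ll N\), then \(1 - t/N \approx 1\) for every term that matters, and

\[ \operatorname{Var}(\hat{\theta}) \approx \frac{\sigma^2}{N}\left[1 + 2\sum_{t=1}^{m}\rho_{t}\right] = \frac{\sigma^2 \tau}{N} = \frac{\sigma^2}{N_{\text{eff}}}. \]

As a check, for an AR(1) process with \(\rho_t = \phi^t\) and \(0 < \phi < 1\), letting \(m\to\infty\) gives \(\tau = (1+\phi)/(1-\phi)\). With \(\phi = 0.9\) we get \(\tau = 19\), so \(N = 1000\) draws are worth roughly \(N_{\text{eff}} \approx 53\) independent ones.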

In practice you are doing Markov Chain Monte Carlo because the problem is difficult enough that you cannot calculate an effective sample size analytically. So you estimate it, which means, in turn, estimating those autocorrelations. We can use FFT to calculate correlation (Geyer 2011). There are various fiddly details, especially if running multiple Markov chains.
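
The single-chain version of this can be sketched as follows. This is a simplified illustration, not Stan's implementation: it handles one chain only, and it truncates the sum with a Geyer-style initial positive sequence rule (stop at the first adjacent pair of autocorrelations whose sum is not positive). The function names `autocorr_fft` and `ess_single_chain` and the AR(1) test chain are assumptions made up for the example.

```python
import numpy as np


def autocorr_fft(x):
    """Sample autocorrelation at lags 0..N-1, computed via FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    # Zero-pad to a power of two >= 2n - 1 so the circular
    # correlation computed by the FFT does not wrap around.
    nfft = 1 << (2 * n - 1).bit_length()
    f = np.fft.rfft(xc, n=nfft)
    acov = np.fft.irfft(f * np.conj(f), n=nfft)[:n] / n  # biased autocovariance
    return acov / acov[0]


def ess_single_chain(x):
    """Estimate N_eff = N / tau with tau = 1 + 2 * sum_t rho_t, truncated."""
    rho = autocorr_fft(x)
    n = rho.size
    n_pairs = n // 2
    # Sums of adjacent pairs rho[2k] + rho[2k+1], k = 0, 1, ...
    pairs = rho[0:2 * n_pairs:2] + rho[1:2 * n_pairs:2]
    nonpos = np.nonzero(pairs <= 0)[0]
    k = nonpos[0] if nonpos.size else n_pairs
    # 1 + 2 * sum_{t>=1} rho_t equals -1 + 2 * sum_{t>=0} rho_t since rho_0 = 1.
    tau = -1.0 + 2.0 * pairs[:k].sum()
    return n / tau


# Hypothetical test chain: AR(1) with phi = 0.9, whose true tau is 19.
rng = np.random.default_rng(1)
n, phi = 20_000, 0.9
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0] / np.sqrt(1 - phi ** 2)  # start in the stationary distribution
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

ess_ar1 = ess_single_chain(x)
ess_iid = ess_single_chain(rng.normal(size=n))
print(f"AR(1): N_eff is about {ess_ar1:.0f}; i.i.d.: N_eff is about {ess_iid:.0f}")
```

For the AR(1) chain the estimate should land near \(N(1-\phi)/(1+\phi)\), and for the i.i.d. draws it should land near \(N\). The sketch does not guard against degenerate inputs such as a constant chain, and strongly negatively correlated chains can trip the truncation rule early. These are some of the fiddly details that production implementations deal with.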

Faes, Christel, Geert Molenberghs, Marc Aerts, Geert Verbeke, and Michael G. Kenward. 2009. “The Effective Sample Size and an Alternative Small-Sample Degrees-of-Freedom Method.” The American Statistician 63 (4): 389–99. https://doi.org/10.1198/tast.2009.08196.

Fang, Youhan, Yudong Cao, and Robert D. Skeel. 2017. “Quasi-Reliable Estimates of Effective Sample Size,” May. http://arxiv.org/abs/1705.03831.

Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd edition. Chapman & Hall/CRC Texts in Statistical Science. Boca Raton: Chapman and Hall/CRC.

Geyer, Charles J. 2011. “Introduction to Markov Chain Monte Carlo.” In Handbook of Markov Chain Monte Carlo, 20116022:45. http://mcmchandbook.net/HandbookChapter1.pdf.

———. 1992. “Practical Markov Chain Monte Carlo.” Statistical Science 7 (4): 473–83. https://doi.org/10.1214/ss/1177011137.

Kong, Augustine. 1992. “A Note on Importance Sampling Using Standardized Weights.” https://galton.uchicago.edu/techreports/tr348.pdf.

Lenth, Russell V. 2001. “Some Practical Guidelines for Effective Sample Size Determination.” The American Statistician 55 (3): 187–93. https://doi.org/10.1198/000313001317098149.

Liu, Jun S. 1996. “Metropolized Independent Sampling with Comparisons to Rejection Sampling and Importance Sampling.” Statistics and Computing 6 (2): 113–19. https://doi.org/10.1007/BF00162521.

Thiébaux, H. J., and F. W. Zwiers. 1984. “The Interpretation and Estimation of Effective Sample Size.” Journal of Climate and Applied Meteorology 23 (5): 800–811. http://adsabs.harvard.edu/abs/1984JApMe..23..800T.

Vehtari, Aki, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner. 2020. “Rank-Normalization, Folding, and Localization: An Improved \(\widehat{R}\) for Assessing Convergence of MCMC,” January. http://arxiv.org/abs/1903.08008.