Bootstrap

Shuffling reality to produce your data

November 26, 2014 — January 27, 2022

estimator distribution
nonparametric
probabilistic algorithms
statistics
uncertainty

The bootstrap resamples your own data to estimate how good your point estimator is and to reduce its bias. In general, it is an intuitive technique, but it gets tricky for, e.g., dependent data. For a handy crib sheet of bootstrap failure modes, see Thomas Lumley, When the bootstrap doesn’t work.
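To make the basic idea concrete, here is a minimal sketch of the plain nonparametric bootstrap for i.i.d. data, using only the Python standard library. The function name `bootstrap_distribution`, the toy exponential sample, and the choice of the median as estimator are all assumptions made for illustration; the crude percentile interval is just one of several interval constructions.

```python
import random
import statistics

def bootstrap_distribution(data, estimator, n_boot=2000, seed=0):
    """Return n_boot replicates of `estimator`, each computed on a
    resample of `data` drawn with replacement (same size as `data`)."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]

# Toy data, made up for illustration: 50 draws from a skewed distribution.
data_rng = random.Random(1)
sample = [data_rng.expovariate(1.0) for _ in range(50)]

reps = bootstrap_distribution(sample, statistics.median)

# The spread of the replicates approximates the estimator's standard error.
se_hat = statistics.stdev(reps)

# Crude 95% percentile interval read off the sorted replicates.
reps_sorted = sorted(reps)
ci = (reps_sorted[int(0.025 * len(reps))],
      reps_sorted[int(0.975 * len(reps)) - 1])

print(f"median={statistics.median(sample):.3f} se={se_hat:.3f} "
      f"ci=({ci[0]:.3f}, {ci[1]:.3f})")
```

Note that this resamples observations independently, which is exactly the assumption that breaks for dependent data.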

In the classical mode, this is a frequentist technique without an immediate Bayesian interpretation.

Commonly credited to B. Efron (1979) and theoretically justified by Gine and Zinn (1990).

1 Teaching

Datacamp bootstrap tutorial.

2 Bootstrap bias correction

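As opposed to variance estimation. No big deal: the bootstrap is notionally telling you the whole sampling distribution of the estimator, so its bias can be estimated from the replicates just as its variance can. 🏗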
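A sketch of the usual recipe, under the assumption of i.i.d. data: estimate the bias as the mean of the bootstrap replicates minus the original estimate, then subtract that from the original estimate. The helper name `bootstrap_bias_corrected` and the toy Gaussian sample are hypothetical. The plug-in variance (dividing by n) is used as the estimator because its downward bias is well known.

```python
import random
import statistics

def bootstrap_bias_corrected(data, estimator, n_boot=4000, seed=0):
    """Return (estimate, bootstrap bias estimate, bias-corrected estimate)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)
    reps = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    # Bias estimate: how far the replicates sit, on average, from theta_hat.
    bias_hat = statistics.mean(reps) - theta_hat
    # Corrected estimate, equivalently 2 * theta_hat - mean(reps).
    return theta_hat, bias_hat, theta_hat - bias_hat

# Toy data, made up for illustration.
data_rng = random.Random(2)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(20)]

# statistics.pvariance divides by n, so it underestimates the variance.
theta_hat, bias_hat, corrected = bootstrap_bias_corrected(
    sample, statistics.pvariance)

print(f"plug-in={theta_hat:.3f} bias_hat={bias_hat:.3f} "
      f"corrected={corrected:.3f}")
```

The correction trades bias for some extra variance, so it is not automatically an improvement in mean squared error.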

3 Bootstrap for dependent data

For example, as presaged above, time series. Parametric bootstrap would be the logical default choice, right? When does that work?
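As a hedged illustration of the parametric route, here is a sketch for a zero-mean AR(1) model: fit the model, simulate fresh series from the fitted model, and refit on each. Everything here (the function names, the least-squares fit, starting each simulation at zero rather than from the stationary distribution) is a simplifying assumption, and the answer is only as good as the AR(1) assumption itself. The block and stationary bootstraps in the references (e.g. Künsch 1989; Politis and Romano 1994) are the usual alternatives when no such model is trusted.

```python
import random
import statistics

def simulate_ar1(phi, sigma, n, rng, x0=0.0):
    """Simulate x[t] = phi * x[t-1] + Gaussian noise with sd sigma."""
    xs, x = [], x0
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        xs.append(x)
    return xs

def fit_ar1(xs):
    """Least-squares fit of a zero-mean AR(1); returns (phi, sigma)."""
    prev, curr = xs[:-1], xs[1:]
    phi = sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)
    resid = [c - phi * p for p, c in zip(prev, curr)]
    sigma = (sum(r * r for r in resid) / len(resid)) ** 0.5
    return phi, sigma

def parametric_bootstrap_ar1(xs, n_boot=500, seed=0):
    """Refit phi on series simulated from the fitted model."""
    rng = random.Random(seed)
    phi_hat, sigma_hat = fit_ar1(xs)
    reps = [fit_ar1(simulate_ar1(phi_hat, sigma_hat, len(xs), rng))[0]
            for _ in range(n_boot)]
    return phi_hat, reps

# Toy series, made up for illustration, with true phi = 0.6.
series = simulate_ar1(0.6, 1.0, 300, random.Random(3))
phi_hat, reps = parametric_bootstrap_ar1(series)
se_phi = statistics.stdev(reps)
print(f"phi_hat={phi_hat:.3f} bootstrap se={se_phi:.3f}")
```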

4 Causal bootstrap

Now a thing! (Imbens and Menzel 2021)

5 As a Bayesian method

It turns out there is absolutely a Bayesian bootstrap if you think hard enough about it. Several, really. Rubin (1981) derived a Bayesian version. See Lyddon, Holmes, and Walker (2019) for a modern update, and Rasmus Bååth for a diagrammed explanation of the points of contact with the frequentist bootstrap and some other things.
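A minimal sketch of the Rubin-style Bayesian bootstrap for a mean: instead of resampling with replacement (integer multinomial counts), draw continuous weights over the observed points from a flat Dirichlet distribution and compute the weighted statistic. The function name and the toy data are assumptions for illustration; the Dirichlet draw is done by normalising independent Exp(1) variates.

```python
import random
import statistics

def bayesian_bootstrap_mean(data, n_draws=2000, seed=0):
    """Posterior draws of the mean under Dirichlet(1, ..., 1) weights
    on the observed data points."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Normalised Exp(1) variates are a Dirichlet(1, ..., 1) draw.
        g = [rng.expovariate(1.0) for _ in data]
        total = sum(g)
        draws.append(sum(gi * x for gi, x in zip(g, data)) / total)
    return draws

# Toy data, made up for illustration.
data_rng = random.Random(4)
sample = [data_rng.gauss(5.0, 1.0) for _ in range(30)]

draws = bayesian_bootstrap_mean(sample)
post_mean = statistics.mean(draws)
post_sd = statistics.stdev(draws)
print(f"posterior mean={post_mean:.3f} posterior sd={post_sd:.3f}")
```

The draws are interpreted as a posterior over the statistic rather than as a sampling distribution, though numerically they tend to look much like the frequentist replicates.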

6 Pedagogic

7 References

Alfaro. 2003. “Bayes or Bootstrap? A Simulation Study Comparing the Performance of Bayesian Markov Chain Monte Carlo Sampling and Bootstrapping in Assessing Phylogenetic Confidence.” Molecular Biology and Evolution.
Bach. 2009. “Model-Consistent Sparse Estimation Through the Bootstrap.” arXiv:0901.3202 [Cs, Stat].
Barber, Candès, Ramdas, et al. 2021. “Predictive Inference with the Jackknife+.” The Annals of Statistics.
Biewen. 2002. “Bootstrap Inference for Inequality, Mobility and Poverty Measurement.” Journal of Econometrics.
Bühlmann. 2002. “Bootstraps for Time Series.” Statistical Science.
Bühlmann, and Künsch. 1999. “Block Length Selection in the Bootstrap for Time Series.” Computational Statistics & Data Analysis.
Burnham, and Anderson. 2004. “Multimodel Inference: Understanding AIC and BIC in Model Selection.” Sociological Methods & Research.
Chang, and Hall. 2015. “Double-Bootstrap Methods That Use a Single Double-Bootstrap Simulation.” Biometrika.
Chen, and Lo. 1997. “On a Mapping Approach to Investigating the Bootstrap Accuracy.” Probability Theory and Related Fields.
Cogneau, and Zakamouline. 2010. “Bootstrap Methods for Finance: Review and Analysis.”
Dahlhaus. 2011. “Discussion: Bootstrap Methods for Dependent Data: A Review.” Journal of the Korean Statistical Society.
DiCiccio, and Efron. 1996a. “[Bootstrap Confidence Intervals]: Rejoinder.” Statistical Science.
———. 1996b. “Bootstrap Confidence Intervals.” Statistical Science.
Efron, B. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics.
Efron, Bradley. 1981. “Nonparametric Estimates of Standard Error: The Jackknife, the Bootstrap and Other Methods.” Biometrika.
———. 2012. “Bayesian Inference and the Parametric Bootstrap.” The Annals of Applied Statistics.
———. 2021. “Resampling Plans and the Estimation of Prediction Error.” Stats.
Fong, and Holmes. 2020. “On the Marginal Likelihood and Cross-Validation.” Biometrika.
Galvani, Bardelli, Figini, et al. 2021. “A Bayesian Nonparametric Learning Approach to Ensemble Models Using the Proper Bayesian Bootstrap.” Algorithms.
Gine, and Zinn. 1990. “Bootstrapping General Empirical Measures.” Annals of Probability.
Giordano, Jordan, and Broderick. 2019. “A Higher-Order Swiss Army Infinitesimal Jackknife.” arXiv:1907.12116 [Cs, Math, Stat].
Gonçalves, and Politis. 2011. “Discussion: Bootstrap Methods for Dependent Data: A Review.” Journal of the Korean Statistical Society.
Gonçalves, and White. 2004. “Maximum Likelihood and the Bootstrap for Nonlinear Dynamic Models.” Journal of Econometrics.
Good, and Good. 1999. Resampling Methods: A Practical Guide to Data Analysis.
Götze, and Künsch. 1996. “Second-Order Correctness of the Blockwise Bootstrap for Stationary Observations.” The Annals of Statistics.
Green, and Shalizi. 2017. “Bootstrapping Exchangeable Random Graphs.” arXiv:1711.00813 [Stat].
Hall. 1992. “On Bootstrap Confidence Intervals in Nonparametric Regression.” The Annals of Statistics.
———. 1994. “Methodology and Theory for the Bootstrap.” In Handbook of Econometrics.
Hall, Horowitz, and Jing. 1995. “On Blocking Rules for the Bootstrap with Dependent Data.” Biometrika.
Härdle, Horowitz, and Kreiss. 2003. “Bootstrap Methods for Time Series.” International Statistical Review.
Hesterberg. 2011. “Bootstrap.” Wiley Interdisciplinary Reviews: Computational Statistics.
Hinkley. 1997. Bootstrap Methods and Their Application.
Imbens, and Menzel. 2021. “A Causal Bootstrap.” The Annals of Statistics.
Künsch. 1989. “The Jackknife and the Bootstrap for General Stationary Observations.” The Annals of Statistics.
Lahiri. 1993. “On the Moving Block Bootstrap Under Long Range Dependence.” Statistics & Probability Letters.
———. 2001. “Effects of Block Lengths on the Validity of Block Resampling Methods.” Probability Theory and Related Fields.
———. 2003. Resampling Methods for Dependent Data.
Lee, and Young. 1996. “[Bootstrap Confidence Intervals]: Comment.” Statistical Science.
Lyddon, Holmes, and Walker. 2019. “General Bayesian Updating and the Loss-Likelihood Bootstrap.” Biometrika.
Papadopoulos, Edwards, and Murray. 2001. “Confidence Estimation Methods for Neural Networks: A Practical Comparison.” IEEE Transactions on Neural Networks.
Paparoditis, and Sapatinas. 2014. “Bootstrap-Based Testing for Functional Data.” arXiv:1409.4317 [Math, Stat].
Politis. 2003. “The Impact of Bootstrap Methods on Time Series Analysis.” Statistical Science.
Politis, and Romano. 1994. “The Stationary Bootstrap.” Journal of the American Statistical Association.
Politis, and White. 2004. “Automatic Block-Length Selection for the Dependent Bootstrap.” Econometric Reviews.
Rodriguez, and Ruiz. 2009. “Bootstrap Prediction Intervals in State–Space Models.” Journal of Time Series Analysis.
Rubin. 1981. “The Bayesian Bootstrap.” Annals of Statistics.
Sanson, Strange, and Garry. 2019. “Trigger Warnings Are Trivially Helpful at Reducing Negative Affect, Intrusive Thoughts, and Avoidance.” Clinical Psychological Science.
Shalizi. 2010. “The Bootstrap.” American Scientist.
Shao. 1996. “Bootstrap Model Selection.” Journal of the American Statistical Association.
Shibata. 1997. “Bootstrap Estimate of Kullback-Leibler Information for Model Selection.” Statistica Sinica.
Stone. 1977. “An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion.” Journal of the Royal Statistical Society. Series B (Methodological).
Tibshirani, Rinaldo, Tibshirani, et al. 2015. “Uniform Asymptotic Inference and the Bootstrap After Model Selection.” arXiv:1506.06266 [Math, Stat].
Vogel, and Shallcross. 1996. “The Moving Blocks Bootstrap Versus Parametric Time Series Models.” Water Resources Research.
Yatchew, and Hardle. 2006. “Nonparametric State Price Density Estimation Using Constrained Least Squares and the Bootstrap.” Journal of Econometrics.