Bootstrap

Shuffling reality to produce your data


Resampling your own data to estimate how good your point estimator is, and to reduce its bias. In general an intuitive technique, although it gets tricky for, e.g., dependent data. For a handy crib sheet of bootstrap failure modes, see Thomas Lumley, When the bootstrap doesn’t work.
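A minimal sketch of the basic nonparametric bootstrap for i.i.d. data, using only the Python standard library. The function names, the choice of the median as the estimator and the percentile interval are illustrative assumptions, not anything prescribed by the sources cited here.

```python
import random
import statistics


def bootstrap_replicates(data, estimator, n_boot=2000, seed=0):
    """Evaluate `estimator` on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]


def bootstrap_se_and_ci(data, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap standard error and a simple percentile interval."""
    reps = sorted(bootstrap_replicates(data, estimator, n_boot, seed))
    se = statistics.stdev(reps)
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return se, (lo, hi)


# Toy data (assumed for illustration): 100 draws from an exponential.
_rng = random.Random(42)
sample = [_rng.expovariate(1.0) for _ in range(100)]
se, (lo, hi) = bootstrap_se_and_ci(sample, statistics.median)
print(f"median={statistics.median(sample):.3f} se={se:.3f} ci=({lo:.3f}, {hi:.3f})")
```

The percentile interval is the crudest of the available interval constructions; DiCiccio and Efron (1996) survey better-calibrated ones.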

In the classical mode, this is a frequentist technique without an immediate Bayesian interpretation.

Commonly credited to B. Efron (1979), with theoretical justification by Gine and Zinn (1990).

Bootstrap bias correction

As opposed to variance estimation. No big deal: the bootstrap is notionally telling you the whole sampling distribution, so you can read a bias estimate off it as easily as a variance. 🏗
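A hedged sketch of the usual recipe: estimate the bias as the mean of the bootstrap replicates minus the original estimate, then subtract that from the original estimate. The helper name and the toy example (the plug-in variance, which divides by n and so is biased downward) are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_bias_correct(data, estimator, n_boot=2000, seed=0):
    """Return (bias-corrected estimate, estimated bias)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)
    reps = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    # Bias estimate: average replicate minus the original estimate.
    bias_hat = statistics.fmean(reps) - theta_hat
    # Corrected estimate: theta_hat - bias_hat = 2 * theta_hat - mean(reps).
    return theta_hat - bias_hat, bias_hat


# Toy data (assumed): 20 standard normal draws.
_rng = random.Random(1)
sample = [_rng.gauss(0.0, 1.0) for _ in range(20)]
corrected, bias_hat = bootstrap_bias_correct(sample, statistics.pvariance)
print(f"plug-in={statistics.pvariance(sample):.3f} "
      f"bias_hat={bias_hat:.3f} corrected={corrected:.3f}")
```

Subtracting an estimated bias adds variance of its own, so the corrected estimator is not automatically better in mean squared error.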

Bootstrap for dependent data

E.g., as presaged, time series. The parametric bootstrap would be the logical default choice, right?
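To make the parametric option concrete, here is a minimal sketch under strong assumptions: the series is taken to be a zero-mean Gaussian AR(1), the coefficient is fitted by least squares, and replicate series are simulated from the fitted model. The model choice and all function names are hypothetical, picked only for illustration. Block resampling, as in Künsch (1989) or Politis and Romano (1994), is the nonparametric alternative and is not shown here.

```python
import math
import random
import statistics


def fit_ar1(x):
    """Least-squares fit of x[t] = phi * x[t-1] + eps[t]; returns (phi, sigma)."""
    prev, curr = x[:-1], x[1:]
    phi = sum(a * b for a, b in zip(prev, curr)) / sum(a * a for a in prev)
    resid = [b - phi * a for a, b in zip(prev, curr)]
    sigma = math.sqrt(sum(r * r for r in resid) / len(resid))
    return phi, sigma


def simulate_ar1(phi, sigma, n, rng, burn_in=100):
    """Simulate n steps of a Gaussian AR(1) after discarding a burn-in."""
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, sigma)
        if t >= burn_in:
            out.append(x)
    return out


def parametric_bootstrap_phi(x, n_boot=500, seed=0):
    """Standard error of phi_hat from series simulated out of the fitted model."""
    rng = random.Random(seed)
    phi_hat, sigma_hat = fit_ar1(x)
    reps = [fit_ar1(simulate_ar1(phi_hat, sigma_hat, len(x), rng))[0]
            for _ in range(n_boot)]
    return phi_hat, statistics.stdev(reps)


# Toy series (assumed): AR(1) with phi = 0.6 and unit innovation variance.
series = simulate_ar1(0.6, 1.0, 200, random.Random(42))
phi_hat, se = parametric_bootstrap_phi(series)
print(f"phi_hat={phi_hat:.3f} bootstrap se={se:.3f}")
```

The catch is that the answer is only as good as the assumed model; if the AR(1) form is wrong, the replicates inherit the misspecification.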

As a Bayesian method

There is absolutely a Bayesian bootstrap, it turns out, if you think hard enough about it. Several, really. Rubin (1981) derived a Bayesian version. See Lyddon, Holmes, and Walker (2019) for a modern update, and Rasmus Bååth for a diagrammed explanation of the points of contact with the frequentist bootstrap and some other things.
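A rough sketch in the spirit of Rubin (1981): instead of resampling observations with replacement, draw observation weights from a flat Dirichlet(1, ..., 1) distribution and recompute the weighted statistic. The function name and the choice of the mean as the statistic are assumptions for illustration. Normalised unit-rate exponential draws are used to generate the Dirichlet weights.

```python
import random


def bayesian_bootstrap_mean(data, n_draws=2000, seed=0):
    """Posterior draws of the mean under Dirichlet(1, ..., 1) weights."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Normalised Exp(1) variates are a draw from Dirichlet(1, ..., 1).
        g = [rng.expovariate(1.0) for _ in data]
        total = sum(g)
        draws.append(sum(w * x for w, x in zip(g, data)) / total)
    return draws


# Toy data (assumed): 50 draws from a normal with mean 10 and sd 2.
_rng = random.Random(42)
sample = [_rng.gauss(10.0, 2.0) for _ in range(50)]
draws = sorted(bayesian_bootstrap_mean(sample))
lo, hi = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws)) - 1]
print(f"95% credible interval for the mean: ({lo:.3f}, {hi:.3f})")
```

The classical bootstrap corresponds to multinomial counts in place of the continuous Dirichlet weights, which is one of the points of contact between the two.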

Pedagogic

References

Alfaro, M. E. 2003. “Bayes or Bootstrap? A Simulation Study Comparing the Performance of Bayesian Markov Chain Monte Carlo Sampling and Bootstrapping in Assessing Phylogenetic Confidence.” Molecular Biology and Evolution 20 (2): 255–66. https://doi.org/10.1093/molbev/msg028.
Bach, Francis. 2009. “Model-Consistent Sparse Estimation Through the Bootstrap.” https://hal.archives-ouvertes.fr/hal-00354771/document.
Biewen, Martin. 2002. “Bootstrap Inference for Inequality, Mobility and Poverty Measurement.” Journal of Econometrics 108 (2): 317–42. https://doi.org/10.1016/S0304-4076(01)00138-5.
Burnham, Kenneth P., and David R. Anderson. 2004. “Multimodel Inference Understanding AIC and BIC in Model Selection.” Sociological Methods & Research 33 (2): 261–304. https://doi.org/10.1177/0049124104268644.
Bühlmann, Peter. 2002. “Bootstraps for Time Series.” Statistical Science 17 (1): 52–72. ftp://stat.ethz.ch/Research-Reports/87.pdf.
Bühlmann, Peter, and Hans R Künsch. 1999. “Block Length Selection in the Bootstrap for Time Series.” Computational Statistics & Data Analysis 31 (3): 295–310. https://doi.org/10.1016/S0167-9473(99)00014-6.
Chang, Jinyuan, and Peter Hall. 2015. “Double-Bootstrap Methods That Use a Single Double-Bootstrap Simulation.” Biometrika 102 (1): 203–14. https://doi.org/10.1093/biomet/asu060.
Chen, Kani, and Shaw-Hwa Lo. 1997. “On a Mapping Approach to Investigating the Bootstrap Accuracy.” Probability Theory and Related Fields 107 (2): 197–217. https://doi.org/10.1007/s004400050083.
Cogneau, Philippe, and Valeri Zakamouline. 2010. “Bootstrap Methods for Finance: Review and Analysis.” Working Paper, University of Agder. http://www.seminar.hec.ulg.ac.be/docs/Sem21.10.10_Cogneau.pdf.
Dahlhaus, Rainer. 2011. “Discussion: Bootstrap Methods for Dependent Data: A Review.” Journal of the Korean Statistical Society 40 (4): 379–81. https://doi.org/10.1016/j.jkss.2011.07.004.
DiCiccio, Thomas J., and Bradley Efron. 1996. “Bootstrap Confidence Intervals.” Statistical Science 11 (3): 189–212. https://doi.org/10.1214/ss/1032280214.
Efron, B. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics 7 (1): 1–26. https://doi.org/10.1214/aos/1176344552.
Efron, Bradley. 1981. “Nonparametric Estimates of Standard Error: The Jackknife, the Bootstrap and Other Methods.” Biometrika 68 (3): 589–99. https://doi.org/10.1093/biomet/68.3.589.
———. 2012. “Bayesian Inference and the Parametric Bootstrap.” The Annals of Applied Statistics 6 (4): 1971–97. https://doi.org/10.1214/12-AOAS571.
Fong, Edwin, and Chris Holmes. 2019. “On the Marginal Likelihood and Cross-Validation.” May 21, 2019. http://arxiv.org/abs/1905.08737.
Gine, Evarist, and Joel Zinn. 1990. “Bootstrapping General Empirical Measures.” Annals of Probability 18 (2): 851–69. https://doi.org/10.1214/aop/1176990862.
Giordano, Ryan, Michael I. Jordan, and Tamara Broderick. 2019. “A Higher-Order Swiss Army Infinitesimal Jackknife.” July 28, 2019. http://arxiv.org/abs/1907.12116.
Gonçalves, Sílvia, and Dimitris Politis. 2011. “Discussion: Bootstrap Methods for Dependent Data: A Review.” Journal of the Korean Statistical Society 40 (4): 383–86. https://doi.org/10.1016/j.jkss.2011.07.003.
Gonçalves, Sílvia, and Halbert White. 2004. “Maximum Likelihood and the Bootstrap for Nonlinear Dynamic Models.” Journal of Econometrics 119 (1): 199–219. https://doi.org/10.1016/S0304-4076(03)00204-5.
Good, Phillip I. 1999. Resampling Methods: A Practical Guide to Data Analysis. Birkhäuser Basel. https://doi.org/10.1007/978-1-4757-3049-4.
Götze, F., and H. R. Künsch. 1996. “Second-Order Correctness of the Blockwise Bootstrap for Stationary Observations.” The Annals of Statistics 24 (5): 1914–33. https://doi.org/10.1214/aos/1069362303.
Green, Alden, and Cosma Rohilla Shalizi. 2017. “Bootstrapping Exchangeable Random Graphs.” November 2, 2017. http://arxiv.org/abs/1711.00813.
Hall, Peter. 1992. “On Bootstrap Confidence Intervals in Nonparametric Regression.” The Annals of Statistics 20 (2): 695–711. http://www.jstor.org/stable/2241979.
———. 1994. “Methodology and Theory for the Bootstrap.” In Handbook of Econometrics, 4:2341–81. Elsevier. http://www.sciencedirect.com/science/article/pii/S157344120580008X.
Hall, Peter, Joel L. Horowitz, and Bing-Yi Jing. 1995. “On Blocking Rules for the Bootstrap with Dependent Data.” Biometrika 82 (3): 561–74. https://doi.org/10.1093/biomet/82.3.561.
Härdle, Wolfgang, Joel Horowitz, and Jens-Peter Kreiss. 2003. “Bootstrap Methods for Time Series.” International Statistical Review 71 (2): 435–59. http://onlinelibrary.wiley.com/doi/10.1111/j.1751-5823.2003.tb00485.x/abstract.
Hesterberg, Tim. 2011. “Bootstrap.” Wiley Interdisciplinary Reviews: Computational Statistics 3 (6): 497–526. https://doi.org/10.1002/wics.182.
Hinkley, David V. 1997. Bootstrap Methods and Their Application. Cambridge ; New York, NY, USA: Cambridge University Press. http://www.loc.gov/catdir/description/cam027/96030064.html.
Künsch, Hans Rudolf. 1989. “The Jackknife and the Bootstrap for General Stationary Observations.” The Annals of Statistics 17 (3): 1217–41. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.28.924&rep=rep1&type=pdf.
Lahiri, S N. 1993. “On the Moving Block Bootstrap Under Long Range Dependence.” Statistics & Probability Letters 18 (5): 405–13. https://doi.org/10.1016/0167-7152(93)90035-H.
———. 2001. “Effects of Block Lengths on the Validity of Block Resampling Methods.” Probability Theory and Related Fields 121: 73–97. https://doi.org/10.1007/PL00008798.
———. 2003. Resampling Methods for Dependent Data. New York: Springer.
Lee, Stephen M. S., and G. Alastair Young. 1996. “[Bootstrap Confidence Intervals]: Comment.” Statistical Science 11 (3): 221–23. http://www.jstor.org/stable/2246114.
Lyddon, S. P., C. C. Holmes, and S. G. Walker. 2019. “General Bayesian Updating and the Loss-Likelihood Bootstrap.” Biometrika 106 (2): 465–78. https://doi.org/10.1093/biomet/asz006.
Paparoditis, Efstathios, and Theofanis Sapatinas. 2014. “Bootstrap-Based Testing for Functional Data.” September 15, 2014. http://arxiv.org/abs/1409.4317.
Politis, Dimitris N. 2003. “The Impact of Bootstrap Methods on Time Series Analysis.” Statistical Science 18 (2): 219–30. https://doi.org/10.1214/ss/1063994977.
Politis, Dimitris N., and Joseph P. Romano. 1994. “The Stationary Bootstrap.” Journal of the American Statistical Association 89 (428): 1303–13. https://doi.org/10.1080/01621459.1994.10476870.
Politis, Dimitris N., and Halbert White. 2004. “Automatic Block-Length Selection for the Dependent Bootstrap.” Econometric Reviews 23 (1): 53–70. https://doi.org/10.1081/ETC-120028836.
Rodriguez, Alejandro, and Esther Ruiz. 2009. “Bootstrap Prediction Intervals in State–Space Models.” Journal of Time Series Analysis 30 (2): 167–78. https://doi.org/10.1111/j.1467-9892.2008.00604.x.
Rubin, Donald B. 1981. “The Bayesian Bootstrap.” Annals of Statistics 9 (1): 130–34. https://doi.org/10.1214/aos/1176345338.
Shalizi, Cosma Rohilla. 2010. “The Bootstrap.” American Scientist 98 (3): 186. https://doi.org/10.1511/2010.84.186.
Shao, Jun. 1996. “Bootstrap Model Selection.” Journal of the American Statistical Association 91 (434): 655–65. https://doi.org/10.2307/2291661.
Shibata, Ritei. 1997. “Bootstrap Estimate of Kullback-Leibler Information for Model Selection.” Statistica Sinica 7: 375–94.
Stone, M. 1977. “An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion.” Journal of the Royal Statistical Society. Series B (Methodological) 39 (1): 44–47. https://doi.org/10.1111/j.2517-6161.1977.tb01603.x.
Tibshirani, Ryan J., Alessandro Rinaldo, Robert Tibshirani, and Larry Wasserman. 2015. “Uniform Asymptotic Inference and the Bootstrap After Model Selection.” June 20, 2015. http://arxiv.org/abs/1506.06266.
Vogel, Richard M., and Amy L. Shallcross. 1996. “The Moving Blocks Bootstrap Versus Parametric Time Series Models.” Water Resources Research 32 (6): 1875–82. https://doi.org/10.1029/96WR00928.
Yatchew, A, and W Hardle. 2006. “Nonparametric State Price Density Estimation Using Constrained Least Squares and the Bootstrap.” Journal of Econometrics 133 (2): 579–99.