Cross Validation

On substituting simulation for analysis in model selection, e.g. when choosing the “right” regularisation parameter for sparse regression.

Asymptotically equivalent to generalised Akaike information criteria (e.g. Stone 1977). Related to the bootstrap in various ways.

The computationally expensive default option when your model doesn’t have any obvious shortcuts for complexity regularisation. For example, I think that the AIC for penalised regression requires penalties that are twice-differentiable at the optimum. I’m not sure that non-smooth penalties couldn’t be made to work, however. Should investigate.
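
A minimal sketch of the brute-force default, for concreteness. Assumptions that are mine rather than anything above: ridge regression stands in for a sparse penalty (so that the fit has a closed form and the example needs only NumPy), the loss is squared error, and the names `ridge_fit`, `kfold_cv_error` and `select_lambda` are hypothetical helpers, not a library API. The point is only that every candidate penalty costs k refits.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients: solve (X'X + lam I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Mean squared held-out prediction error over k folds."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    sq_err = 0.0
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Refit on the training folds only, then score on the held-out fold.
        beta = ridge_fit(X[train_mask], y[train_mask], lam)
        resid = y[test_idx] - X[test_idx] @ beta
        sq_err += np.sum(resid ** 2)
    return sq_err / n

def select_lambda(X, y, lams, k=5, seed=0):
    """Return the penalty in `lams` with the smallest k-fold error, plus all errors."""
    errors = np.array([kfold_cv_error(X, y, lam, k, seed) for lam in lams])
    return lams[int(np.argmin(errors))], errors

# Synthetic demonstration data (an assumption, purely illustrative).
rng = np.random.default_rng(42)
X = rng.normal(size=(80, 10))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=80)
lams = np.logspace(-3, 3, 13)
best_lam, cv_errors = select_lambda(X, y, lams)
print("selected penalty:", best_lam)
```

The same folds are reused for every candidate penalty (fixed `seed`), so the comparison between penalties is not blurred by fold-to-fold noise.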

Present alternatives, especially ones that go beyond n-fold cross-validation, and especially computationally tractable ones. Methods based on statistical learning theory or concentration inequalities win gratitude.

🏗

Generalised Cross Validation

Why the name? It’s specialised cross-validation, AFAICT.

🏗 Hat matrix, smoother matrix. Note comparative computational efficiency. Define hat matrix.
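
A sketch of the hat-matrix shortcut, under stated assumptions. For a linear smoother the fitted values are $\hat{y} = H y$, where $H$ is the hat (or smoother) matrix; for ridge regression with penalty $\lambda$ and no intercept, $H(\lambda) = X(X^\top X + \lambda I)^{-1}X^\top$. In that setting the leave-one-out residual is $(y_i - \hat{y}_i)/(1 - H_{ii})$, so leave-one-out cross-validation needs one fit rather than $n$. Generalised cross-validation, as I understand Golub, Heath and Wahba (1979), replaces each $H_{ii}$ by the average $\operatorname{tr}(H)/n$. The function names below are hypothetical, and ridge is my choice of example smoother.

```python
import numpy as np

def hat_matrix(X, lam):
    """Ridge hat matrix H = X (X'X + lam I)^{-1} X'."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def loo_cv_shortcut(X, y, lam):
    """Leave-one-out MSE from a single fit, using the diagonal of H."""
    H = hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def gcv(X, y, lam):
    """Generalised cross-validation score: each H_ii replaced by trace(H)/n."""
    n = X.shape[0]
    H = hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean(resid ** 2) / (1.0 - np.trace(H) / n) ** 2

def loo_cv_brute_force(X, y, lam):
    """Leave-one-out MSE by refitting n times, for comparison only."""
    n, p = X.shape
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xk, yk = X[keep], y[keep]
        beta = np.linalg.solve(Xk.T @ Xk + lam * np.eye(p), Xk.T @ yk)
        errs[i] = (y[i] - X[i] @ beta) ** 2
    return errs.mean()

# Synthetic demonstration data (an assumption, purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = X @ rng.normal(size=6) + rng.normal(size=50)
for lam in (0.1, 1.0, 10.0):
    print(lam, loo_cv_shortcut(X, y, lam), loo_cv_brute_force(X, y, lam), gcv(X, y, lam))
```

The shortcut and the brute-force loop should agree to numerical precision here, while the GCV score is close to both but not identical. The cost comparison is one linear solve per penalty value versus $n$ of them.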

Andrews, Donald W. K. 1991. “Asymptotic Optimality of Generalized $C_L$, Cross-Validation, and Generalized Cross-Validation in Regression with Heteroskedastic Errors.” Journal of Econometrics 47 (2): 359–77. https://doi.org/10.1016/0304-4076(91)90107-O.

Giordano, Ryan, Michael I. Jordan, and Tamara Broderick. 2019. “A Higher-Order Swiss Army Infinitesimal Jackknife,” July. http://arxiv.org/abs/1907.12116.

Giordano, Ryan, Will Stephenson, Runjing Liu, Michael I. Jordan, and Tamara Broderick. 2019. “A Swiss Army Infinitesimal Jackknife.” In AISTATS. http://arxiv.org/abs/1806.00550.

Golub, Gene H., Michael Heath, and Grace Wahba. 1979. “Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics 21 (2): 215–23. https://doi.org/10.1080/00401706.1979.10489751.

Hall, Peter, Jeff Racine, and Qi Li. 2004. “Cross-Validation and the Estimation of Conditional Probability Densities.” Journal of the American Statistical Association 99 (468): 1015–26. https://doi.org/10.1198/016214504000000548.

Laan, Mark J. van der, Eric C Polley, and Alan E. Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1). https://doi.org/10.2202/1544-6115.1309.

Li, Ker-Chau. 1987. “Asymptotic Optimality for $C_p, C_L$, Cross-Validation and Generalized Cross-Validation: Discrete Index Set.” The Annals of Statistics 15 (3): 958–75. https://doi.org/10.1214/aos/1176350486.

Polley, Eric, and Mark van der Laan. 2010. “Super Learner in Prediction.” U.C. Berkeley Division of Biostatistics Working Paper Series, May. https://biostats.bepress.com/ucbbiostat/paper266.

Stone, M. 1977. “An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion.” Journal of the Royal Statistical Society. Series B (Methodological) 39 (1): 44–47. http://www.stat.washington.edu/courses/stat527/s14/readings/Stone1977.pdf.

Wood, S. 1994. “Monotonic Smoothing Splines Fitted by Cross Validation.” SIAM Journal on Scientific Computing 15 (5): 1126–33. https://doi.org/10.1137/0915069.