Cross validation

On substituting simulation for analysis in model selection, in e.g. choosing the “right” regularisation parameter for sparse regression.

The computationally expensive default option when your model doesn’t have any obvious shortcuts for complexity regularization, for example when AIC cannot be shown to work.

To learn: how this interacts with Bayesian inference.

Basic Cross Validation
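
A minimal sketch of K-fold cross validation used to choose a lasso penalty, in plain Python. Everything named here is an illustrative assumption rather than something taken from the references: the toy coordinate-descent solver `fit_lasso`, the synthetic data, the candidate grid `lams`, and the choice of 5 folds. In practice a library implementation would replace the solver.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, lam, n_sweeps=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residuals y - Xb, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # inner product of column j with the residual that excludes b[j]
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n))
            new_bj = soft_threshold(rho, n * lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def mse(X, y, b):
    """Mean squared prediction error of coefficients b on (X, y)."""
    preds = [sum(xj * bj for xj, bj in zip(xi, b)) for xi in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Average held-out MSE over k folds for a given penalty lam."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
        errs.append(mse([X[i] for i in held_out], [y[i] for i in held_out], b))
    return sum(errs) / k

# Synthetic sparse problem (assumed for illustration): 2 active predictors out of 6.
rng = random.Random(1)
n, p = 60, 6
true_b = [3.0, -2.0] + [0.0] * (p - 2)
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(xj * bj for xj, bj in zip(xi, true_b)) + rng.gauss(0, 1) for xi in X]

lams = [0.001, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0]
cv = {lam: kfold_cv_error(X, y, lam) for lam in lams}
best_lam = min(cv, key=cv.get)
b_hat = fit_lasso(X, y, best_lam)  # refit on all data at the selected penalty
print(best_lam, [round(bj, 2) for bj in b_hat])
```

The same seed is reused for every penalty so that all candidates are scored on identical folds, which keeps the comparison between penalties paired. Note the cost: one model fit per fold per candidate penalty, which is what makes this the expensive default.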


Generalised Cross Validation

Why the name? It’s specialised cross-validation, AFAICS. (Andrews 1991; Golub, Heath, and Wahba 1979; Li 1987)

πŸ— Hat matrix, smoother matrix. Note comparative computational efficiency. Define hat matrix.

Bayesian Cross validation


What even is cross validation?

I always thought the answer here was simple: it is asymptotically equivalent to generalised Akaike information criteria (e.g. Stone 1977), and it is related to the bootstrap in various ways.

But there is other stuff going on. Here is an interesting sampling of opinions: Rob Tibshirani, Yuling Yao, and Aki Vehtari on cross validation

Testing leakage

The vtreat introduction mentions their “why you need hold-out” article and also Perlich and Świrszcz (2011):

Cross-methods such as cross-validation, and cross-prediction are effective tools for many machine learning, statistics, and data science related applications. They are useful for parameter selection, model selection, impact/target encoding of high cardinality variables, stacking models, and super learning. As cross-methods simulate access to an out of sample data set the same size as the original data, they are more statistically efficient, lower variance, than partitioning training data into calibration/training/holdout sets. However, cross-methods do not satisfy the full exchangeability conditions that full hold-out methods have. This introduces some additional statistical trade-offs when using cross-methods, beyond the obvious increases in computational cost.

Specifically, cross-methods can introduce an information leak into the modeling process.
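
A minimal sketch of the baseline problem that hold-out and cross-methods are meant to address, in the spirit of the “seemingly predictive models on random data” title. It target-encodes a high-cardinality categorical variable against a pure-noise response, once in-sample and once out-of-fold. The data, the fold count, and the function names are all assumptions made for illustration, and the sketch does not attempt to reproduce the subtler exchangeability issue of cross-methods mentioned in the quote.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def level_means(levels, y):
    """Mean response per category level."""
    sums, counts = {}, {}
    for g, v in zip(levels, y):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

def encode_in_sample(levels, y):
    """Naive target encoding: each row's own response leaks into its code."""
    means = level_means(levels, y)
    return [means[g] for g in levels]

def encode_out_of_fold(levels, y, k=5, seed=0):
    """Cross-fitted target encoding: codes come only from the other folds."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    encoded = [0.0] * n
    for f in range(k):
        held_out = idx[f::k]
        held = set(held_out)
        train = [i for i in idx if i not in held]
        means = level_means([levels[i] for i in train], [y[i] for i in train])
        # levels unseen in the training folds fall back to the training mean
        fallback = sum(y[i] for i in train) / len(train)
        for i in held_out:
            encoded[i] = means.get(levels[i], fallback)
    return encoded

# Pure noise: the response is independent of the categorical variable.
rng = random.Random(42)
n, n_levels = 2000, 500
levels = [rng.randrange(n_levels) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

r_in = pearson(encode_in_sample(levels, y), y)
r_oof = pearson(encode_out_of_fold(levels, y), y)
print(round(r_in, 3), round(r_oof, 3))
```

On data like this `r_in` should come out clearly positive even though there is nothing to predict, while `r_oof` should sit near zero. Any downstream model trained on the in-sample codes inherits that spurious signal.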


Andrews, Donald W. K. 1991. “Asymptotic Optimality of Generalized \(C_L\), Cross-Validation, and Generalized Cross-Validation in Regression with Heteroskedastic Errors.” Journal of Econometrics 47 (2): 359–77.
Bates, Stephen, Trevor Hastie, and Robert Tibshirani. n.d. “Cross-Validation: What Does It Estimate and How Well Does It Do It?” 36.
Bürkner, Paul-Christian, Jonah Gabry, and Aki Vehtari. 2020. “Approximate Leave-Future-Out Cross-Validation for Bayesian Time Series Models.” Journal of Statistical Computation and Simulation 90 (14): 2499–2523.
Bürkner, Paul-Christian, Jonah Gabry, and Aki Vehtari. 2021. “Efficient Leave-One-Out Cross-Validation for Bayesian Non-Factorized Normal and Student-t Models.” Computational Statistics 36 (2): 1243–61.
Giordano, Ryan, Michael I. Jordan, and Tamara Broderick. 2019. “A Higher-Order Swiss Army Infinitesimal Jackknife.” arXiv:1907.12116 [Cs, Math, Stat], July.
Giordano, Ryan, Will Stephenson, Runjing Liu, Michael I. Jordan, and Tamara Broderick. 2019. “A Swiss Army Infinitesimal Jackknife.” In AISTATS.
Golub, Gene H., Michael Heath, and Grace Wahba. 1979. “Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics 21 (2): 215–23.
Hall, Peter, Jeff Racine, and Qi Li. 2004. “Cross-Validation and the Estimation of Conditional Probability Densities.” Journal of the American Statistical Association 99 (468): 1015–26.
Laan, Mark J. van der, Eric C. Polley, and Alan E. Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1).
Li, Ker-Chau. 1987. “Asymptotic Optimality for \(C_p, C_L\), Cross-Validation and Generalized Cross-Validation: Discrete Index Set.” The Annals of Statistics 15 (3): 958–75.
Perlich, Claudia, and Grzegorz Świrszcz. 2011. “On Cross-Validation and Stacking: Building Seemingly Predictive Models on Random Data.” ACM SIGKDD Explorations Newsletter 12 (2): 11–15.
Polley, Eric C. 2010. “Super Learner In Prediction.” U.C. Berkeley Division of Biostatistics Working Paper Series, May.
Sivula, Tuomas, Måns Magnusson, and Aki Vehtari. 2020a. “Unbiased Estimator for the Variance of the Leave-One-Out Cross-Validation Estimator for a Bayesian Normal Model with Fixed Variance.” arXiv:2008.10859 [Stat], August.
Sivula, Tuomas, Måns Magnusson, and Aki Vehtari. 2020b. “Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison.” arXiv:2008.10296 [Stat], October.
Stone, M. 1977. “An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion.” Journal of the Royal Statistical Society. Series B (Methodological) 39 (1): 44–47.
Wood, S. 1994. “Monotonic Smoothing Splines Fitted by Cross Validation.” SIAM Journal on Scientific Computing 15 (5): 1126–33.
Yao, Yuling, Aki Vehtari, Daniel Simpson, and Andrew Gelman. 2018. “Using Stacking to Average Bayesian Predictive Distributions.” Bayesian Analysis 13 (3): 917–1007.
