On choosing the right model and the right regularisation parameter in sparse regression. These turn out to be nearly the same problem, and both are closely coupled to the regression itself. There are some wrinkles.
TODO: Talk about when degrees-of-freedom penalties work, when cross-validation does, and so on.
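To make the cross-validation option concrete, here is a minimal sketch (my own illustration, not taken from any of the referenced papers): a toy coordinate-descent lasso plus k-fold cross-validation over a grid of regularisation parameters. The helper names `lasso_cd` and `cv_lasso` are mine.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Toy lasso via cyclic coordinate descent: minimises
    (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # correlation of feature j with the partial residual, then soft-threshold
            rho = X[:, j] @ (y - X @ beta + X[:, j] * beta[j]) / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def cv_lasso(X, y, lambdas, k=5, seed=0):
    """Choose lam by k-fold cross-validated prediction error."""
    folds = np.random.default_rng(seed).permutation(len(y)) % k
    cv_err = []
    for lam in lambdas:
        errs = [
            np.mean((y[folds == f]
                     - X[folds == f] @ lasso_cd(X[folds != f], y[folds != f], lam)) ** 2)
            for f in range(k)
        ]
        cv_err.append(np.mean(errs))
    return lambdas[int(np.argmin(cv_err))], np.array(cv_err)
```

Note that cross-validation targets prediction error, not support recovery; the CV-optimal lambda is typically smaller than the one you would want for consistent model selection.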
FOCI
The new hotness sweeping the world is FOCI, a sparse model selection procedure (Azadkia and Chatterjee 2019) based on Chatterjee's ξ statistic as an independence test (Chatterjee 2020). Looks interesting.
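For intuition, the no-ties version of the ξ statistic is simple enough to sketch in a few lines. To be clear: this is only the marginal dependence coefficient of Chatterjee (2020), not the full conditional-dependence coefficient or the FOCI stepwise selection procedure of Azadkia and Chatterjee (2019); the function name is mine.

```python
import numpy as np

def xi_correlation(x, y):
    """Chatterjee's xi coefficient (continuous / no-ties case).
    Approaches 1 when y is a measurable function of x, and 0 under
    independence. Note it is deliberately asymmetric in (x, y)."""
    n = len(x)
    order = np.argsort(x)                       # sort the pairs by x
    r = np.argsort(np.argsort(y[order])) + 1    # ranks of y in that order
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n ** 2 - 1)
```

Unlike Pearson or Spearman correlation, ξ picks up non-monotone functional relationships such as y = x², which is what makes it attractive as a screening statistic.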
Relaxed Lasso
TODO.
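In the meantime, a sketch of the simplest relaxed-lasso variant: the lasso chooses the support, then unpenalised least squares is refit on that support to undo the shrinkage bias. This is the limiting case of the relaxed lasso (relaxation parameter fully unwound); a tuned implementation would cross-validate over both parameters. Names are mine, and a toy coordinate-descent lasso is inlined to keep the snippet self-contained.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    # toy cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ (y - X @ beta + X[:, j] * beta[j]) / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def relaxed_lasso(X, y, lam):
    """Lasso for support selection, then unpenalised OLS refit on
    the selected support (removes the lasso's shrinkage bias)."""
    support = np.flatnonzero(lasso_cd(X, y, lam) != 0)
    beta = np.zeros(X.shape[1])
    if support.size:
        beta[support], *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    return beta
```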
Dantzig Selector
TODO.
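The Dantzig selector minimises the ℓ1 norm of the coefficients subject to a sup-norm bound on the correlation of the residuals with the design, which makes it a linear program. A sketch via `scipy.optimize.linprog`, using the standard split into positive and negative parts (the formulation and names here are mine):

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    """min ||b||_1  s.t.  ||X'(y - Xb)||_inf / n <= lam,
    solved as an LP in b = b_plus - b_minus with b_plus, b_minus >= 0.
    Note lam is on the 1/n-scaled correlation."""
    n, p = X.shape
    G = X.T @ X / n
    c = X.T @ y / n
    # |c - G(b+ - b-)| <= lam, elementwise, as two stacked inequalities
    A_ub = np.vstack([np.hstack([G, -G]),    #  G(b+ - b-) <= c + lam
                      np.hstack([-G, G])])   # -G(b+ - b-) <= lam - c
    b_ub = np.concatenate([c + lam, lam - c])
    res = linprog(np.ones(2 * p), A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]
```

The feasible set is never empty (the OLS solution zeroes the constraint exactly when n > p), so the interesting tuning question is how large lam must be to cover the noise level.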
Garrote
TODO.
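Breiman's nonnegative garrote (Breiman 1995) shrinks each OLS coefficient by a nonnegative factor, with the factors chosen by a penalised least-squares problem; some factors hit zero exactly, which is what performs the selection. A sketch solving the penalised form with projected gradient descent (the solver choice and the function name are mine, not Breiman's):

```python
import numpy as np

def nn_garrote(X, y, lam, n_iter=5000, lr=None):
    """Nonnegative garrote: find c >= 0 minimising
    ||y - Z c||^2 + lam * sum(c), where Z_j = x_j * beta_ols_j,
    then return c * beta_ols. Solved by projected gradient descent."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    Z = X * beta_ols                  # column j of X scaled by beta_ols[j]
    if lr is None:
        # 1/L step for the 2 Z'Z gradient Lipschitz constant
        lr = 0.5 / np.linalg.norm(Z, 2) ** 2
    c = np.ones(X.shape[1])
    for _ in range(n_iter):
        grad = 2 * Z.T @ (Z @ c - y) + lam
        c = np.maximum(c - lr * grad, 0.0)   # project onto c >= 0
    return c * beta_ols
```

Because the garrote starts from the OLS fit, it needs n > p; that limitation is part of why the lasso, which drops the OLS pre-estimate, displaced it.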
Degrees-of-freedom penalties
See degrees of freedom.
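As a minimal sketch of a degrees-of-freedom penalty in action: for the lasso, the number of nonzero coefficients is an unbiased estimate of the effective degrees of freedom (Zou, Hastie, and Tibshirani 2007), which can be plugged into a Mallows-Cp-style criterion. The names are mine, the noise variance is assumed known, and a toy coordinate-descent lasso is inlined for self-containedness.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    # toy cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ (y - X @ beta + X[:, j] * beta[j]) / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def cp_select(X, y, lambdas, sigma2):
    """Pick lam minimising a Mallows-Cp-style criterion,
    RSS/n + 2 * sigma2 * df / n, with df = #nonzero coefficients
    (the unbiased df estimate for the lasso)."""
    n = len(y)
    def cp(lam):
        b = lasso_cd(X, y, lam)
        return np.sum((y - X @ b) ** 2) / n + 2 * sigma2 * np.count_nonzero(b) / n
    return min(lambdas, key=cp)
```

Unlike cross-validation this needs no data splitting, but it does need a trustworthy estimate of sigma2, which is itself awkward in high dimensions.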
References
Abramovich, Felix, Yoav Benjamini, David L. Donoho, and Iain M. Johnstone. 2006. "Adapting to Unknown Sparsity by Controlling the False Discovery Rate." The Annals of Statistics 34 (2): 584–653.
Azadkia, Mona, and Sourav Chatterjee. 2019. "A Simple Measure of Conditional Dependence." arXiv:1910.12327 [Cs, Math, Stat], December.
Banerjee, Onureena, Laurent El Ghaoui, and Alexandre d'Aspremont. 2008. "Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data." Journal of Machine Learning Research 9 (Mar): 485–516.
Barbier, Jean. 2015. "Statistical Physics and Approximate Message-Passing Algorithms for Sparse Linear Estimation Problems in Signal Processing and Coding Theory." arXiv:1511.01650 [Cs, Math], November.
Barron, Andrew R., Albert Cohen, Wolfgang Dahmen, and Ronald A. DeVore. 2008. "Approximation and Learning by Greedy Algorithms." The Annals of Statistics 36 (1): 64–94.
Bayati, M., and A. Montanari. 2012. "The LASSO Risk for Gaussian Matrices." IEEE Transactions on Information Theory 58 (4): 1997–2017.
Berk, Richard, Lawrence Brown, Andreas Buja, Kai Zhang, and Linda Zhao. 2013. "Valid Post-Selection Inference." The Annals of Statistics 41 (2): 802–37.
Bertin, K., E. Le Pennec, and V. Rivoirard. 2011. "Adaptive Dantzig Density Estimation." Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 47 (1): 43–74.
Bertsimas, Dimitris, Angela King, and Rahul Mazumder. 2016. "Best Subset Selection via a Modern Optimization Lens." The Annals of Statistics 44 (2): 813–52.
Bondell, Howard D., Arun Krishna, and Sujit K. Ghosh. 2010. "Joint Variable Selection for Fixed and Random Effects in Linear Mixed-Effects Models." Biometrics 66 (4): 1069–77.
Breiman, Leo. 1995. "Better Subset Regression Using the Nonnegative Garrote." Technometrics 37 (4): 373–84.
Bühlmann, Peter, and Sara van de Geer. 2015. "High-Dimensional Inference in Misspecified Linear Models." Electronic Journal of Statistics 9 (1): 1449–73.
Bunea, Florentina, Alexandre B. Tsybakov, and Marten H. Wegkamp. 2007a. "Sparse Density Estimation with ℓ1 Penalties." In Learning Theory, edited by Nader H. Bshouty and Claudio Gentile, 530–43. Lecture Notes in Computer Science. Springer Berlin Heidelberg.
Bunea, Florentina, Alexandre Tsybakov, and Marten Wegkamp. 2007b. "Sparsity Oracle Inequalities for the Lasso." Electronic Journal of Statistics 1: 169–94.
Carmi, Avishy Y. 2014. "Compressive System Identification." In Compressed Sensing & Sparse Filtering, edited by Avishy Y. Carmi, Lyudmila Mihaylova, and Simon J. Godsill, 281–324. Signals and Communication Technology. Springer Berlin Heidelberg.
Chatterjee, Sourav. 2020. "A New Coefficient of Correlation." arXiv:1909.10140 [Math, Stat], January.
Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters." arXiv:1608.00060 [Econ, Stat], July.
Chernozhukov, Victor, Christian Hansen, Yuan Liao, and Yinchu Zhu. 2018. "Inference for Heterogeneous Effects Using Low-Rank Estimations." arXiv:1812.08089 [Math, Stat], December.
Chernozhukov, Victor, Whitney K. Newey, and Rahul Singh. 2018. "Learning L2 Continuous Regression Functionals via Regularized Riesz Representers." arXiv:1809.05224 [Econ, Math, Stat], September.
Chetverikov, Denis, Zhipeng Liao, and Victor Chernozhukov. 2016. "On Cross-Validated Lasso." arXiv:1605.02214 [Math, Stat], May.
Chichignoud, Michaël, Johannes Lederer, and Martin Wainwright. 2014. "A Practical Scheme and Fast Algorithm to Tune the Lasso with Optimality Guarantees." arXiv:1410.0247 [Math, Stat], October.
Descloux, Pascaline, and Sylvain Sardy. 2018. "Model Selection with Lasso-Zero: Adding Straw to the Haystack to Better Find Needles." arXiv:1805.05133 [Stat], May.
Dossal, Charles, Maher Kachour, Jalal M. Fadili, Gabriel Peyré, and Christophe Chesneau. 2011. "The Degrees of Freedom of the Lasso for General Design Matrix." arXiv:1111.1162 [Cs, Math, Stat], November.
El Karoui, Noureddine. 2008. "Operator Norm Consistent Estimation of Large Dimensional Sparse Covariance Matrices." The Annals of Statistics 36 (6): 2717–56.
Ewald, Karl, and Ulrike Schneider. 2015. "Confidence Sets Based on the Lasso Estimator." arXiv:1507.05315 [Math, Stat], July.
Fan, Jianqing, and Runze Li. 2001. "Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties." Journal of the American Statistical Association 96 (456): 1348–60.
Fan, Jianqing, and Jinchi Lv. 2010. "A Selective Overview of Variable Selection in High Dimensional Feature Space." Statistica Sinica 20 (1): 101–48.
Flynn, Cheryl J., Clifford M. Hurvich, and Jeffrey S. Simonoff. 2013. "Efficiency for Regularization Parameter Selection in Penalized Likelihood Estimation of Misspecified Models." arXiv:1302.2068 [Stat], February.
Freijeiro-González, Laura, Manuel Febrero-Bande, and Wenceslao González-Manteiga. 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates." International Statistical Review 90 (1): 118–45.
Geer, Sara A. van de. 2008. "High-Dimensional Generalized Linear Models and the Lasso." The Annals of Statistics 36 (2): 614–45.
Geer, Sara A. van de, Peter Bühlmann, and Shuheng Zhou. 2011. "The Adaptive and the Thresholded Lasso for Potentially Misspecified Models (and a Lower Bound for the Lasso)." Electronic Journal of Statistics 5: 688–749.
Geer, Sara van de. 2016. Estimation and Testing Under Sparsity. Vol. 2159. Lecture Notes in Mathematics. Cham: Springer International Publishing.
Hall, Peter, Jiashun Jin, and Hugh Miller. 2014. "Feature Selection When There Are Many Influential Features." Bernoulli 20 (3): 1647–71.
Hall, Peter, and Jing-Hao Xue. 2014. "On Selecting Interacting Features from High-Dimensional Data." Computational Statistics & Data Analysis 71 (March): 694–708.
Hansen, Niels Richard, Patricia Reynaud-Bouret, and Vincent Rivoirard. 2015. "Lasso and Probabilistic Inequalities for Multivariate Point Processes." Bernoulli 21 (1): 83–143.
Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: Chapman and Hall/CRC.
Hastie, Trevor, Robert Tibshirani, and Ryan J. Tibshirani. 2017. "Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso." arXiv.
Hirose, Kei, Shohei Tateishi, and Sadanori Konishi. 2011. "Efficient Algorithm to Select Tuning Parameters in Sparse Regression Modeling with Regularization." arXiv:1109.2411 [Stat], September.
Huang, Cong, G. L. H. Cheang, and Andrew R. Barron. 2008. "Risk of Penalized Least Squares, Greedy Selection and L1 Penalization for Flexible Function Libraries."
Janková, Jana, and Sara van de Geer. 2016. "Confidence Regions for High-Dimensional Generalized Linear Models Under Sparsity." arXiv:1610.01353 [Math, Stat], October.
Javanmard, Adel, and Andrea Montanari. 2014. "Confidence Intervals and Hypothesis Testing for High-Dimensional Regression." Journal of Machine Learning Research 15 (1): 2869–909.
Kato, Kengo. 2009. "On the Degrees of Freedom in Shrinkage Estimation." Journal of Multivariate Analysis 100 (7): 1338–52.
Kim, Yongdai, Sunghoon Kwon, and Hosik Choi. 2012. "Consistent Model Selection Criteria on High Dimensions." Journal of Machine Learning Research 13 (Apr): 1037–57.
Koltchinskii, Vladimir. 2011. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Lecture Notes in Mathematics, École d'Été de Probabilités de Saint-Flour 2033. Heidelberg: Springer.
Lam, Clifford, and Jianqing Fan. 2009. "Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation." Annals of Statistics 37 (6B): 4254–78.
Lederer, Johannes, and Michael Vogt. 2020. "Estimating the Lasso's Effective Noise." arXiv:2004.11554 [Stat], April.
Lee, Jason D., Dennis L. Sun, Yuekai Sun, and Jonathan E. Taylor. 2013. "Exact Post-Selection Inference, with Application to the Lasso." arXiv:1311.6238 [Math, Stat], November.
Lemhadri, Ismael, Feng Ruan, Louis Abraham, and Robert Tibshirani. 2021. "LassoNet: A Neural Network with Feature Sparsity." Journal of Machine Learning Research 22 (127): 1–29.
Li, Wei, and Johannes Lederer. 2019. "Tuning Parameter Calibration for ℓ1-Regularized Logistic Regression." Journal of Statistical Planning and Inference 202 (September): 80–98.
Lim, Néhémy, and Johannes Lederer. 2016. "Efficient Feature Selection with Large and High-Dimensional Data." arXiv:1609.07195 [Stat], September.
Lockhart, Richard, Jonathan Taylor, Ryan J. Tibshirani, and Robert Tibshirani. 2014. "A Significance Test for the Lasso." The Annals of Statistics 42 (2): 413–68.
Lundberg, Scott M., and Su-In Lee. 2017. "A Unified Approach to Interpreting Model Predictions." In Advances in Neural Information Processing Systems. Vol. 30. Curran Associates, Inc.
Meinshausen, Nicolai, and Peter Bühlmann. 2006. "High-Dimensional Graphs and Variable Selection with the Lasso." The Annals of Statistics 34 (3): 1436–62.
Meinshausen, Nicolai, and Bin Yu. 2009. "Lasso-Type Recovery of Sparse Representations for High-Dimensional Data." The Annals of Statistics 37 (1): 246–70.
Naik, Prasad A., and Chih-Ling Tsai. 2001. "Single-Index Model Selections." Biometrika 88 (3): 821–32.
Nickl, Richard, and Sara van de Geer. 2013. "Confidence Sets in Sparse Regression." The Annals of Statistics 41 (6): 2852–76.
Portnoy, Stephen, and Roger Koenker. 1997. "The Gaussian Hare and the Laplacian Tortoise: Computability of Squared-Error Versus Absolute-Error Estimators." Statistical Science 12 (4): 279–300.
Reynaud-Bouret, Patricia. 2003. "Adaptive Estimation of the Intensity of Inhomogeneous Poisson Processes via Concentration Inequalities." Probability Theory and Related Fields 126 (1).
Reynaud-Bouret, Patricia, and Sophie Schbath. 2010. "Adaptive Estimation for Hawkes Processes; Application to Genome Analysis." The Annals of Statistics 38 (5): 2781–2822.
Semenova, Lesia, Cynthia Rudin, and Ronald Parr. 2021. "A Study in Rashomon Curves and Volumes: A New Perspective on Generalization and Model Simplicity in Machine Learning." arXiv:1908.01755 [Cs, Stat], April.
Shen, Xiaotong, and Hsin-Cheng Huang. 2006. "Optimal Model Assessment, Selection, and Combination." Journal of the American Statistical Association 101 (474): 554–68.
Shen, Xiaotong, Hsin-Cheng Huang, and Jimmy Ye. 2004. "Adaptive Model Selection and Assessment for Exponential Family Distributions." Technometrics 46 (3): 306–17.
Shen, Xiaotong, and Jianming Ye. 2002. "Adaptive Model Selection." Journal of the American Statistical Association 97 (457): 210–21.
Tarr, Garth, Samuel Müller, and Alan H. Welsh. 2018. "mplot: An R Package for Graphical Model Stability and Variable Selection Procedures." Journal of Statistical Software 83 (1): 1–28.
Tibshirani, Robert. 1996. "Regression Shrinkage and Selection via the Lasso." Journal of the Royal Statistical Society. Series B (Methodological) 58 (1): 267–88.
Tibshirani, Ryan J. 2014. "A General Framework for Fast Stagewise Algorithms." arXiv:1408.5801 [Stat], August.
Wang, Hansheng, Guodong Li, and Guohua Jiang. 2007. "Robust Regression Shrinkage and Consistent Variable Selection Through the LAD-Lasso." Journal of Business & Economic Statistics 25 (3): 347–55.
Xu, H., C. Caramanis, and S. Mannor. 2012. "Sparse Algorithms Are Not Stable: A No-Free-Lunch Theorem." IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (1): 187–93.
Yoshida, Ryo, and Mike West. 2010. "Bayesian Learning in Sparse Graphical Factor Models via Variational Mean-Field Annealing." Journal of Machine Learning Research 11 (May): 1771–98.
Yuan, Ming, and Yi Lin. 2006. "Model Selection and Estimation in Regression with Grouped Variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (1): 49–67.
———. 2007. "Model Selection and Estimation in the Gaussian Graphical Model." Biometrika 94 (1): 19–35.
Zhang, Cun-Hui. 2010. "Nearly Unbiased Variable Selection Under Minimax Concave Penalty." The Annals of Statistics 38 (2): 894–942.
Zhang, Cun-Hui, and Stephanie S. Zhang. 2014. "Confidence Intervals for Low Dimensional Parameters in High Dimensional Linear Models." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76 (1): 217–42.
Zhang, Yiyun, Runze Li, and Chih-Ling Tsai. 2010. "Regularization Parameter Selections via Generalized Information Criterion." Journal of the American Statistical Association 105 (489): 312–23.
Zhao, Peng, Guilherme Rocha, and Bin Yu. 2006. "Grouped and Hierarchical Model Selection Through Composite Absolute Penalties."
———. 2009. "The Composite Absolute Penalties Family for Grouped and Hierarchical Variable Selection." The Annals of Statistics 37 (6A): 3468–97.
Zhao, Peng, and Bin Yu. 2006. "On Model Selection Consistency of Lasso." Journal of Machine Learning Research 7 (Nov): 2541–63.
Zou, Hui. 2006. "The Adaptive Lasso and Its Oracle Properties." Journal of the American Statistical Association 101 (476): 1418–29.
Zou, Hui, and Trevor Hastie. 2005. "Regularization and Variable Selection via the Elastic Net." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20.
Zou, Hui, Trevor Hastie, and Robert Tibshirani. 2007. "On the 'Degrees of Freedom' of the Lasso." The Annals of Statistics 35 (5): 2173–92.