Abbeel, Pieter, Daphne Koller, and Andrew Y. Ng. 2006. “Learning Factor Graphs in Polynomial Time and Sample Complexity.” Journal of Machine Learning Research 7 (December): 1743–88.
Banerjee, Arindam, Sheng Chen, Farideh Fazayeli, and Vidyashankar Sivakumar. 2014. “Estimation with Norm Regularization.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 1556–64. Curran Associates, Inc.
Barbier, Jean, Florent Krzakala, Nicolas Macris, Léo Miolane, and Lenka Zdeborová. 2017. “Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models.” arXiv:1708.03395 [Cond-Mat, Physics:math-Ph], August.
Barbour, A. D., and Louis H. Y. Chen, eds. 2005. An Introduction to Stein’s Method. Lecture Notes Series, Institute for Mathematical Sciences, National University of Singapore, Vol. 4. Singapore: Singapore University Press; Hackensack, NJ: World Scientific.
Barron, Andrew R., Cong Huang, Jonathan Q. Li, and Xi Luo. 2008. “MDL, Penalized Likelihood, and Statistical Risk.” In IEEE Information Theory Workshop (ITW ’08), 247–57.
Bartlett, Peter L., Michael I. Jordan, and Jon D. McAuliffe. 2006. “Convexity, Classification, and Risk Bounds.” Journal of the American Statistical Association 101 (473): 138–56.
Bartlett, Peter L., and Shahar Mendelson. 2002. “Rademacher and Gaussian Complexities: Risk Bounds and Structural Results.” Journal of Machine Learning Research 3 (November): 463–82.
Bellec, Pierre C., and Alexandre B. Tsybakov. 2016. “Bounds on the Prediction Error of Penalized Least Squares Estimators with Convex Penalty.” arXiv:1609.06675 [Math, Stat], September.
Bertin, K., E. Le Pennec, and V. Rivoirard. 2011. “Adaptive Dantzig Density Estimation.” Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 47 (1): 43–74.
Birgé, Lucien. 2008. “Model Selection for Density Estimation with L2-Loss.” arXiv:0808.1416 [Math, Stat], August.
Blumer, Anselm, A. Ehrenfeucht, David Haussler, and Manfred K. Warmuth. 1989. “Learnability and the Vapnik-Chervonenkis Dimension.” Journal of the ACM 36 (4): 929–65.
Bousquet, Olivier, Stéphane Boucheron, and Gábor Lugosi. 2004. “Introduction to Statistical Learning Theory.” In Advanced Lectures on Machine Learning, edited by Olivier Bousquet, Ulrike von Luxburg, and Gunnar Rätsch, 169–207. Lecture Notes in Computer Science 3176. Springer Berlin Heidelberg.
Bousquet, Olivier, Steve Hanneke, and Shay Moran. 2020. “A Theory of Universal Learning.”
Bunea, Florentina, Alexandre B. Tsybakov, and Marten H. Wegkamp. 2007a. “Sparse Density Estimation with ℓ1 Penalties.” In Learning Theory, edited by Nader H. Bshouty and Claudio Gentile, 530–43. Lecture Notes in Computer Science. Springer Berlin Heidelberg.
Bunea, Florentina, Alexandre B. Tsybakov, and Marten H. Wegkamp. 2007b. “Sparsity Oracle Inequalities for the Lasso.” Electronic Journal of Statistics 1: 169–94.
Canas, Guillermo D., and Lorenzo Rosasco. 2012. “Learning Probability Measures with Respect to Optimal Transport Metrics.” arXiv:1209.1077 [Cs, Stat], September.
Chichignoud, Michaël, Johannes Lederer, and Martin Wainwright. 2014. “A Practical Scheme and Fast Algorithm to Tune the Lasso with Optimality Guarantees.” arXiv:1410.0247 [Math, Stat], October.
David, Ofir, Shay Moran, and Amir Yehudayoff. 2016. “On Statistical Learning via the Lens of Compression.” arXiv:1610.03592 [Cs, Math], October.
Devroye, Luc, László Györfi, and Gábor Lugosi. 1996. A Probabilistic Theory of Pattern Recognition. New York: Springer.
Dinh, Vu, Lam Si Tung Ho, Duy Nguyen, and Binh T. Nguyen. 2016. “Fast Learning Rates with Heavy-Tailed Losses.” In Advances in Neural Information Processing Systems 29. Curran Associates, Inc.
Flammia, Steven T., David Gross, Yi-Kai Liu, and Jens Eisert. 2012. “Quantum Tomography via Compressed Sensing: Error Bounds, Sample Complexity, and Efficient Estimators.” New Journal of Physics 14 (9): 095022.
Foygel, Rina, and Nathan Srebro. 2011. “Fast-Rate and Optimistic-Rate Error Bounds for L1-Regularized Regression.” arXiv:1108.0373 [Math, Stat], August.
Geer, Sara van de. 2002. “On Hoeffding’s Inequality for Dependent Random Variables.” In Empirical Process Techniques for Dependent Data. Birkhäuser.
———. 2008. “High-Dimensional Generalized Linear Models and the Lasso.” The Annals of Statistics 36 (2): 614–45.
———. 2014. “Worst Possible Sub-Directions in High-Dimensional Models.” arXiv:1403.7023 [Math, Stat], March.
Gnecco, Giorgio, and Marcello Sanguineti. 2008. “Approximation Error Bounds via Rademacher’s Complexity.” Applied Mathematical Sciences 2 (4): 153–76.
Golowich, Noah, Alexander Rakhlin, and Ohad Shamir. 2017. “Size-Independent Sample Complexity of Neural Networks.” arXiv:1712.06541 [Cs, Stat], December.
Golubev, Grigori K., and Michael Nussbaum. 1990. “A Risk Bound in Sobolev Class Regression.” The Annals of Statistics 18 (2): 758–78.
Goodfellow, Ian, Nicolas Papernot, and Patrick McDaniel. 2016. “CleverHans v0.1: An Adversarial Machine Learning Library.” arXiv:1610.00768 [Cs, Stat], October.
Gribonval, Rémi, Gilles Blanchard, Nicolas Keriven, and Yann Traonmilin. 2017. “Compressive Statistical Learning with Random Feature Moments.” arXiv:1706.07180 [Cs, Math, Stat], June.
Hansen, Niels Richard, Patricia Reynaud-Bouret, and Vincent Rivoirard. 2015. “Lasso and Probabilistic Inequalities for Multivariate Point Processes.” Bernoulli 21 (1): 83–143.
Hasminskii, Rafael, and Ildar Ibragimov. 1990. “On Density Estimation in the View of Kolmogorov’s Ideas in Approximation Theory.” The Annals of Statistics 18 (3): 999–1010.
Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: Chapman and Hall/CRC.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
Kabán, Ata. 2014. “New Bounds on Compressive Linear Least Squares Regression.” In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 448–56.
Kloft, Marius, Ulrich Rückert, and Peter L. Bartlett. 2010. “A Unifying View of Multiple Kernel Learning.” In Machine Learning and Knowledge Discovery in Databases, edited by José Luis Balcázar, Francesco Bonchi, Aristides Gionis, and Michèle Sebag, 66–81. Lecture Notes in Computer Science. Springer Berlin Heidelberg.
Koltchinskii, Vladimir. 2011. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Lecture Notes in Mathematics 2033, École d’Été de Probabilités de Saint-Flour. Heidelberg: Springer.
Krishnapuram, Balaji, Lawrence Carin, Mario A. T. Figueiredo, and Alexander J. Hartemink. 2005. “Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds.” IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (6): 957–68.
Kuznetsov, Vitaly, and Mehryar Mohri. 2015. “Learning Theory and Algorithms for Forecasting Non-Stationary Time Series.” In Advances in Neural Information Processing Systems, 541–49. Curran Associates, Inc.
Lederer, Johannes, and Sara van de Geer. 2014. “New Concentration Inequalities for Suprema of Empirical Processes.” Bernoulli 20 (4): 2020–38.
Luxburg, Ulrike von, and Bernhard Schölkopf. 2008. “Statistical Learning Theory: Models, Concepts, and Results.” arXiv:0810.4752 [Math, Stat], October.
Maclaurin, Dougal, David Duvenaud, and Ryan Adams. 2015. “Gradient-Based Hyperparameter Optimization Through Reversible Learning.” In Proceedings of the 32nd International Conference on Machine Learning, 2113–22. PMLR.
Maurya, Abhinav. 2018. “Optimal Transport in Statistical Machine Learning: Selected Review and Some Open Questions.”
McDonald, Daniel J., Cosma Rohilla Shalizi, and Mark Schervish. 2011a. “Generalization Error Bounds for Stationary Autoregressive Models.” arXiv:1103.0942 [Cs, Stat], March.
———. 2011b. “Risk Bounds for Time Series Without Strong Mixing.” arXiv:1106.0730 [Cs, Stat], June.
———. 2011c. “Estimated VC Dimension for Risk Bounds.” arXiv:1111.3404, November.
Mei, Song, Yu Bai, and Andrea Montanari. 2016. “The Landscape of Empirical Risk for Non-Convex Losses.” arXiv:1607.06534 [Stat], July.
Meng, Qi, Yue Wang, Wei Chen, Taifeng Wang, Zhi-Ming Ma, and Tie-Yan Liu. 2016. “Generalization Error Bounds for Optimization Algorithms via Stability.” arXiv:1609.08397 [Stat], September.
Natarajan, B. K. 1989. “On Learning Sets and Functions.” Machine Learning 4 (1): 67–97.
Oneto, Luca. 2018. “Model Selection and Error Estimation Without the Agonizing Pain.” WIREs Data Mining and Knowledge Discovery 8 (4): e1252.
Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. 2013. “On the Difficulty of Training Recurrent Neural Networks.” In Proceedings of the 30th International Conference on Machine Learning, 1310–18. PMLR.
Reinert, Gesine. 2000. “Stein’s Method for Epidemic Processes.” In Complex Stochastic Systems, 235–75. Boca Raton: Chapman & Hall/CRC.
Reynaud-Bouret, Patricia, and Sophie Schbath. 2010. “Adaptive Estimation for Hawkes Processes; Application to Genome Analysis.” The Annals of Statistics 38 (5): 2781–2822.
Schölkopf, Bernhard, and Alexander J. Smola. 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
———. 2003. “A Short Introduction to Learning with Kernels.” In Advanced Lectures on Machine Learning, edited by Shahar Mendelson and Alexander J. Smola, 41–64. Lecture Notes in Computer Science 2600. Springer Berlin Heidelberg.
Şimşekli, Umut, Ozan Sener, George Deligiannidis, and Murat A. Erdogdu. 2020. “Hausdorff Dimension, Stochastic Differential Equations, and Generalization in Neural Networks.” arXiv:2006.09313, June.
Singh, Shashank, and Barnabás Póczos. 2018. “Minimax Distribution Estimation in Wasserstein Distance.” arXiv:1802.08855 [Cs, Math, Stat], February.
Vainsencher, Daniel, Shie Mannor, and Alfred M. Bruckstein. 2011. “The Sample Complexity of Dictionary Learning.” Journal of Machine Learning Research 12 (November): 3259–81.
Vapnik, Vladimir. 2010. The Nature of Statistical Learning Theory. Softcover reprint of the 2nd (2000) ed. New York: Springer.
Vapnik, Vladimir, Esther Levin, and Yann Le Cun. 1994. “Measuring the VC-Dimension of a Learning Machine.” Neural Computation 6 (5): 851–76.
Wu, Qiang, and Ding-Xuan Zhou. 2008. “Learning with Sample Dependent Hypothesis Spaces.” Computers & Mathematics with Applications 56 (11): 2896–2907.
Zhang, Chiyuan, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2017. “Understanding Deep Learning Requires Rethinking Generalization.” In Proceedings of ICLR.