Statistics and machine learning

This page mostly exists to collect a good selection of introductory overviews of statistics that are not terrible. I’m especially interested in modern methods that fuse what we would call statistics with what we would call machine learning, and in the unnecessary terminological confusion between the two fields.

Here are some recommended courses to get started if you don’t know what you’re doing.

See also the recommended texts below. May I draw your attention especially to Kroese et al. (2019), which I proofread for my supervisor Zdravko Botev and enjoyed greatly? It smoothly leads mathematicians from outside statistics into applied statistics without being excruciating, unlike introductions aimed at laypeople.

There are also statistics podcasts.


Cox, D. R., and D. V. Hinkley. 2000. Theoretical Statistics. Boca Raton: Chapman & Hall/CRC.
Dadkhah, Kamran. 2011. Foundations of Mathematical and Computational Economics.
Devroye, Luc, László Györfi, and Gábor Lugosi. 1996. A Probabilistic Theory of Pattern Recognition. New York: Springer. gyorfi/pbook.pdf.
Efron, Bradley, and Trevor Hastie. 2016. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Institute of Mathematical Statistics Monographs. New York, NY: Cambridge University Press.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd ed. Chapman & Hall/CRC Texts in Statistical Science. Boca Raton: Chapman and Hall/CRC.
Guttman, Louis. 1977. “What Is Not What in Statistics.” Journal of the Royal Statistical Society. Series D (The Statistician) 26 (2): 81–107.
Guttorp, Peter. 1995. Stochastic Modeling of Scientific Data. 1st ed. Stochastic Modeling Series. London: Chapman & Hall.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer.
Kobayashi, Hisashi, Brian L. Mark, and William Turin. 2011. Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance. Cambridge University Press.
Kroese, Dirk P., Zdravko I. Botev, Thomas Taimre, and Radislav Vaisman. 2019. Mathematical and Statistical Methods for Data Science and Machine Learning. 1st ed. Chapman & Hall/CRC Machine Learning & Pattern Recognition. Boca Raton: CRC Press.
Lehmann, E. L., and George Casella. 1998. Theory of Point Estimation. 2nd ed. Springer Texts in Statistics. New York: Springer.
Lehmann, Erich L., and Joseph P. Romano. 2010. Testing Statistical Hypotheses. 3rd ed. Springer Texts in Statistics. New York, NY: Springer.
Mohri, Mehryar, Afshin Rostamizadeh, and Ameet Talwalkar. 2018. Foundations of Machine Learning. 2nd ed. Adaptive Computation and Machine Learning. Cambridge, Massachusetts: The MIT Press.
Murphy, Kevin P. 2012. Machine Learning: A Probabilistic Perspective. 1st ed. Adaptive Computation and Machine Learning Series. Cambridge, MA: MIT Press.
Robert, Christian P., and George Casella. 2004. Monte Carlo Statistical Methods. 2nd ed. Springer Texts in Statistics. New York: Springer.
Schervish, Mark J. 2012. Theory of Statistics. Springer Series in Statistics. New York, NY: Springer Science & Business Media.
Vaart, Aad W. van der. 2007. Asymptotic Statistics. 1st paperback ed., 8th printing. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press.
Wasserman, Larry. 2013. All of Statistics: A Concise Course in Statistical Inference. Springer.