Teaching mathematics and statistics

How do I introduce people to statistics/data science/analytics? What is the punchiest, most efficient modern curriculum?


I do not know much about pedagogy of mathematics yet.

Here are some links that might help:

See also general pedagogy.

Foundational statistics

I do not mean “measure theoretic probability” but rather “intuition-building introductions to the project of learning about the world systematically from evidence.”

One incredible project is Hubbard (2014), the book by Douglas Hubbard which reframes all of traditional statistics in terms of measuring things. He then compresses an incredible amount of medium-to-advanced methodology into some Excel spreadsheets. The art is that he gets lots of mileage out of statistical tricks that are usually de-emphasised because they are not mathematically lavish enough to make good exam questions.

The Curious Journalist’s Guide to Data, by Jonathan Stray:

This is a book about the principles behind data journalism. Not what visualization software to use and how to scrape a website, but the fundamental ideas that underlie the human use of data. This isn’t “how to use data” but “how data works.”

This gets into some of the mathy parts of statistics, but also the difficulty of taking a census of race and the cognitive psychology of probabilities. It traces where data comes from, what journalists do with it, and where it goes after—and tries to understand the possibilities and limitations. Data journalism is as interdisciplinary as it gets, which can make it difficult to assemble all the pieces you need. This is one attempt. This is a technical book, and uses standard technical language, but all mathematical concepts are explained through pictures and examples rather than formulas.

The life of data has three parts: quantification, analysis, and communication. Quantification is the process that creates data. Analysis involves rearranging the data or combining it with other information to produce new knowledge. And none of this is useful without communicating the result.

Carl T. Bergstrom and Jevin West, in Calling Bullshit: Data Reasoning in a Digital World, have excellent framing and a wide syllabus covering different types of bullshit.

Miller (2013) attempts to systematize data analysis for non-specialists who need to analyse data and present it to others. The table of contents looks incredible (a kind of minimum viable statistician program), but I have not read it.


Especially Bayes’ Theorem for discrete events. We can start from axiomatic probability theory, but there is evidence that teaching with frequencies works better (Gigerenzer and Hoffrage 1995; Sedlmeier and Gigerenzer 2001).
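To make the contrast concrete, here is a minimal sketch that computes the same posterior two ways: once by counting people in a notional population (the frequency framing), and once with the usual Bayes formula. The screening-test numbers (1% prevalence, 80% sensitivity, 10% false positive rate) and the function names are my own illustrative assumptions, chosen for round arithmetic; they are not taken from the cited papers.

```python
from fractions import Fraction


def posterior_by_frequencies(population, prior, sensitivity, false_positive_rate):
    """Count people instead of multiplying probabilities."""
    with_condition = population * prior
    without_condition = population - with_condition
    true_positives = with_condition * sensitivity
    false_positives = without_condition * false_positive_rate
    posterior = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, posterior


def posterior_by_bayes(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem for discrete events."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical numbers, for illustration only.
prior = Fraction(1, 100)
sensitivity = Fraction(8, 10)
false_positive_rate = Fraction(1, 10)

tp, fp, post_freq = posterior_by_frequencies(1000, prior, sensitivity, false_positive_rate)
post_bayes = posterior_by_bayes(prior, sensitivity, false_positive_rate)

print(f"Of 1000 people, {tp} test positive and are ill; {fp} test positive and are not.")
print(f"So {tp} out of {tp + fp} positives are ill: {float(post_freq):.3f}")
print(f"Bayes' theorem gives the same answer: {float(post_bayes):.3f}")
```

The two routes agree exactly; the frequency version just makes the denominator (everyone who tests positive) visible as a head count.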

As computational practice

In courses “for hackers” we attempt to give coders stats skills by leveraging their coding skills. I think there is some interesting stuff to be done there, because coding can get you to many of the same places as mathematics. Cameron Davidson-Pilon, Probabilistic Programming & Bayesian Methods for Hackers (source) is worth trying.
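As a small illustration of code reaching the same place as mathematics, here is a sketch that answers the classic birthday problem twice: once with the closed-form product, and once by brute-force simulation. This is my own toy example rather than anything from the book above; the function names, trial count and seed are arbitrary assumptions.

```python
import random


def birthday_exact(n_people, n_days=365):
    """Probability that at least two of n_people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n_people):
        p_all_distinct *= (n_days - k) / n_days
    return 1.0 - p_all_distinct


def birthday_simulated(n_people, n_trials=20_000, n_days=365, seed=0):
    """Estimate the same probability by drawing random birthdays."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        birthdays = [rng.randrange(n_days) for _ in range(n_people)]
        if len(set(birthdays)) < n_people:  # a repeated day means a shared birthday
            hits += 1
    return hits / n_trials


exact = birthday_exact(23)
simulated = birthday_simulated(23)
print(f"exact: {exact:.4f}, simulated: {simulated:.4f}")
```

The simulation needs no combinatorics, only the ability to write a loop, and it extends to variants (unequal birth rates, three-way matches) where the closed form gets awkward.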

There are some more classical approaches, of course. Here are some freely available online.

Going even deeper down this hole, A Data-Centric Introduction to Computing:

we propose a new perspective on structuring computing curricula, which we call data centricity. We view a data-centric curriculum as

data centric = data science + data structures

in that order: we begin with ideas from data science, before shifting to classical ideas from data structures and the rest of computer science. This book lays out this vision concretely and in detail.

Second, computing education talks a great deal about notional machines—abstractions of program behavior meant to help students understand how programs work—but few curricula actually use one. We take notional machines seriously, developing a sequence of them and weaving them through the curriculum. This ties to our belief that programs are not only objects that run, but also objects that we reason about.

Third, we weave content on socially-responsible computing into the text. Unlike other efforts that focus on exposing students to ethics or the pitfalls of technology in general, we aim to show students how the constructs and concepts that they are turning into code right now can lead to adverse impacts unless used with care. In keeping with our focus on testing and concrete examples, we introduce several topics by getting students to think about assumptions at the level of concrete data. This material is called out explicitly throughout the book.

Philosophical / general


See bootstrap.

Hierarchical models

See hierarchical models.

Causal inference

See causal inference.

Hypothesis testing

See statistical tests. My question is: do I need to teach this? Is it ever what my students actually need?


Bayesian inference

For various reasons, probably best done at textbook length. Here are some that look fun.

See Bayesian inference.

Teaching R

See R.




Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd edition. Chapman & Hall/CRC Texts in Statistical Science. Boca Raton: Chapman and Hall/CRC.
Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and other stories. Cambridge, UK: Cambridge University Press.
Gelman, Andrew, and Eric Loken. 2012. “Statisticians: When We Teach, We Don’t Practice What We Preach.” Chance 25 (1): 47–48.
Gelman, Andrew, and Deborah Nolan. 2017. Teaching Statistics: A Bag of Tricks. 2nd edition. Oxford: Oxford University Press.
Gigerenzer, Gerd, and Ulrich Hoffrage. 1995. “How to Improve Bayesian Reasoning Without Instruction: Frequency Formats.” Psychological Review 102 (4): 684–704.
Good, Phillip I., and Philip Good. 1999. Resampling Methods: A Practical Guide to Data Analysis. Birkhäuser Basel.
Hubbard, Douglas W. 2014. How to Measure Anything: Finding the Value of Intangibles in Business. 3rd edition. Hoboken, New Jersey: Wiley.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge, United Kingdom ; New York, NY: Cambridge University Press.
Krishnamurthi, Shriram, and Kathi Fisler. 2020. “Data-Centricity: A Challenge and Opportunity for Computing Education.” Communications of the ACM 63 (8): 24–26.
Kroese, Dirk P., Zdravko I. Botev, Thomas Taimre, and Radislav Vaisman. 2019. Mathematical and Statistical Methods for Data Science and Machine Learning. First edition. Chapman & Hall/CRC Machine Learning & Pattern Recognition. Boca Raton: CRC Press.
Lovett, Marsha, Oded Meyer, and Candace Thille. 2008. “The Open Learning Initiative: Measuring the Effectiveness of the OLI Statistics Course in Accelerating Student Learning.” Journal of Interactive Media in Education 2008 (1): Art. 13.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and STAN. Boca Raton: CRC Press.
McElreath, Richard, and Robert Boyd. 2007. Mathematical Models of Social Evolution: A Guide for the Perplexed. University Of Chicago Press.
Miller, Jane E. 2013. The Chicago Guide to Writing about Multivariate Analysis. Second edition. Chicago Guides to Writing, Editing, and Publishing. Chicago: University of Chicago Press.
Nolan, Deborah, and Sara Stoudt. 2020. “Reading to Write.” Significance 17 (6): 34–37.
Osborne, Jonathan. 2022. “Science Education in an Age of Misinformation.”
Sedlmeier, P., and G. Gigerenzer. 2001. “Teaching Bayesian Reasoning in Less Than Two Hours.” Journal of Experimental Psychology: General 130 (3): 380–400.
