Teaching mathematics and especially statistics

2020-02-11 — 2025-01-18

Wherein curricula and resources for introducing statistics are surveyed, with emphasis on intuition‑building, Bayesian and computational approaches, and practical tools such as R and probabilistic spreadsheets.

academe

communicating

faster pussycat

learning

mind

statistics

How do I introduce people to statistics/data science/analytics? What is the most punchy, most efficient modern curriculum?

1 Pedagogy

I do not know much about the pedagogy of mathematics yet.¹

Here are some links that might help:

2 Foundational statistics

I do not mean “measure theoretic probability” but rather “intuition-building introductions to the project of learning about the world systematically from evidence.”

One incredible project is Hubbard (2014), Douglas Hubbard’s book, which reframes traditional statistics in terms of measuring things. He then compresses an incredible amount of medium-to-advanced methodology into some Excel spreadsheets. The art is that he gets lots of mileage out of statistical tricks that are usually de-emphasised because they are not mathematically lavish enough to make good exam questions.

The Curious Journalist’s Guide to Data, by Jonathan Stray.

This is a book about the principles behind data journalism. Not what visualisation software to use and how to scrape a website, but the fundamental ideas that underlie the human use of data. This isn’t “how to use data” but “how data works.”

This gets into some of the mathy parts of statistics, but also the difficulty of taking a census of race and the cognitive psychology of probabilities. It traces where data comes from, what journalists do with it, and where it goes after—and tries to understand the possibilities and limitations. Data journalism is as interdisciplinary as it gets, which can make it difficult to assemble all the pieces you need. This is one attempt. This is a technical book, and uses standard technical language, but all mathematical concepts are explained through pictures and examples rather than formulas.

The life of data has three parts: quantification, analysis, and communication. Quantification is the process that creates data. Analysis involves rearranging the data or combining it with other information to produce new knowledge. And none of this is useful without communicating the result.

Carl T. Bergstrom and Jevin West, in Calling Bullshit: Data Reasoning in a Digital World, offer excellent framing and a wide-ranging syllabus that catalogues different types of bullshit.

Miller (2013) attempts to systematize data analysis for non-specialists who need to analyse and present it to others. The table of contents looks incredible — kind of a minimum viable statistician program — but I have not read it.

3 Memes, puns and cartoons

4 Probability

Especially Bayes’ Theorem for discrete events. We can start from axiomatic probability theory, but there is evidence that teaching with frequencies rather than probabilities works better (Gigerenzer and Hoffrage 1995; Sedlmeier and Gigerenzer 2001).

David Spiegelhalter, Using expected frequencies when teaching probability.
Great Expectations: Probability Through Problems
DRMacIver’s Notebook: Probably enough probability for you
I wrote and animated a lecture about probability.
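To make the frequency idea concrete, here is a minimal sketch of Bayes’ Theorem done by counting people rather than multiplying probabilities. The screening-test numbers (1% prevalence, 80% sensitivity, 10% false positive rate) and the function name are my own hypothetical choices, not taken from the papers cited above.

```python
from fractions import Fraction


def posterior_from_frequencies(population, base_rate, sensitivity, false_positive_rate):
    """Bayes' Theorem by counting heads in an imagined population.

    Returns (true_positives, false_positives, posterior).
    """
    sick = population * base_rate
    healthy = population - sick
    true_pos = sick * sensitivity                # sick people who test positive
    false_pos = healthy * false_positive_rate    # healthy people who test positive
    posterior = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, posterior


# Hypothetical screening test, chosen only for illustration.
tp, fp, post = posterior_from_frequencies(
    population=Fraction(1000),
    base_rate=Fraction(1, 100),
    sensitivity=Fraction(8, 10),
    false_positive_rate=Fraction(1, 10),
)
print(f"Of 1000 people, {tp} sick and {fp} healthy people test positive.")
print(f"So {tp} of the {tp + fp} positives are actually sick: {float(post):.3f}")
```

The answer is the same as the one from the usual formula, but the story "8 out of 107 positives are sick" is arguably easier to hold in your head than a ratio of conditional probabilities.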

5 As computational practice

In courses “for hackers” we attempt to give coders stats skills by leveraging their coding skills. I think there is some interesting stuff to be done there, because coding can get you to many of the same places as mathematics. Cameron Davidson-Pilon, Probabilistic Programming & Bayesian Methods for Hackers (source) is worth trying.
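As a small illustration of coding reaching the same place as mathematics, here is a sketch that answers the classic birthday problem twice: once with the exact product formula and once by brute-force simulation. The example is my own choice rather than anything from the book above, and the trial count and seed are arbitrary.

```python
import random


def exact_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday (exact)."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def simulated_shared_birthday(n, trials=20_000, days=365, seed=0):
    """Same probability, estimated by simulating many rooms of n people."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # a repeated value means a shared birthday
            hits += 1
    return hits / trials


exact = exact_shared_birthday(23)
approx = simulated_shared_birthday(23)
print(f"exact: {exact:.4f}, simulated: {approx:.4f}")
```

The simulation needs no combinatorics, only a loop, and its answer should land close to the exact one. The price is Monte Carlo error, which shrinks as the number of trials grows.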

There are some more classical approaches, of course. Here are some freely available online.

Mine Çetinkaya-Rundel and Johanna Hardin, Introduction to Modern Statistics
Intro to Statistics In One Hour
Applied Statistics with R
Project Mosaic

publishes university-level texts in statistics, data science, modeling, and scientific computing.

Handsome lookin’ statistics options include Daniel T. Kaplan’s Statistical Modeling: A Fresh Approach, and his guide to computational calculus.

Going even deeper down this hole, A Data-Centric Introduction to Computing:

we propose a new perspective on structuring computing curricula, which we call data centricity. We view a data-centric curriculum as

data centric = data science + data structures

in that order: we begin with ideas from data science, before shifting to classical ideas from data structures and the rest of computer science. This book lays out this vision concretely and in detail.

Second, computing education talks a great deal about notional machines—abstractions of program behaviour meant to help students understand how programs work—but few curricula actually use one. We take notional machines seriously, developing a sequence of them and weaving them through the curriculum. This ties to our belief that programs are not only objects that run, but also objects that we reason about.

Third, we weave content on socially-responsible computing into the text. Unlike other efforts that focus on exposing students to ethics or the pitfalls of technology in general, we aim to show students how the constructs and concepts that they are turning into code right now can lead to adverse impacts unless used with care. In keeping with our focus on testing and concrete examples, we introduce several topics by getting students to think about assumptions at the level of concrete data. This material is called out explicitly throughout the book.

5.1 Philosophical / general

Jonathan Stray, The Curious Journalist’s Guide to Data
Cathy O’Neil, Weapons of Math Destruction is a guide to how the methods we are learning are abused
Daniel T. Kaplan’s guide to computational calculus teaches you how to cheat at calculus.
Miller (2013) is a course targeted at, e.g., journalists, on writing about how they get their conclusions from data.

5.5 Hypothesis testing

See statistical tests. My question is: do I need to teach this? Is it ever what my students actually need?

5.6 Regression

Daniel T. Kaplan’s Statistical Modeling: A Fresh Approach has nice illustrations of resampling.
Cosma Rohilla Shalizi, Advanced Data Analysis from an Elementary Point of View (entire book free online).
Bradley Efron and Trevor Hastie, Computer Age Statistical Inference (entire book free online).
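Resampling is easy to demonstrate in a few lines, so here is a minimal sketch of a case-resampling bootstrap for a regression slope. This is not code from any of the books above: the synthetic data (true slope 2, Gaussian noise), the helper names, and the percentile interval are all my own assumptions for illustration.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slopes(xs, ys, n_boot=2000, seed=0):
    """Refit the slope on n_boot resamples of the (x, y) pairs, drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return sorted(slopes)


# Synthetic data, made up for this example: y = 2x + 1 + noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(50)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 2.0) for x in xs]

slope = ols_slope(xs, ys)
slopes = bootstrap_slopes(xs, ys)
lo = slopes[int(0.025 * len(slopes))]   # rough 2.5th percentile
hi = slopes[int(0.975 * len(slopes))]   # rough 97.5th percentile
print(f"slope: {slope:.2f}, 95% percentile interval: ({lo:.2f}, {hi:.2f})")
```

The appeal for teaching is that the interval comes from repeating the same fit many times, with no sampling-distribution formula to memorise.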

6 Bayesian inference

I feel that Bayesian inference is probably best done at length, ideally at textbook length, because it needs more intuition-building. For a contrary viewpoint, see Gigerenzer and Hoffrage (1995) and Sedlmeier and Gigerenzer (2001).

For some excellent lengthy treatments, see Bayesian inference.

6.1 Teaching R

See R.

7 Tools

probabilistic spreadsheets
manim is a tool for pedagogic mathematical animations
quarto is designed to integrate plots and slides etc
Desmos is an online graphing calculator

7.1 geogebra

geogebra is a neat Java web app for creating nice pedagogical plots.

8 Practice-led

Tutorials targeting domain specialists who want to learn data science

Kaggle Tutorials on Python, Data Viz, Pandas & More
carpentries incubator: ml-python-supervised-learning: Supervised Learning with Python
carpentries incubator: Introduction to Machine Learning with Scikit Learn
fast.ai’s Practical Deep Learning for Coders

9 Moore method

The Moore method sounds delightful and terrifying (Chalice 1995; Cohen 1982; Zitarelli 2004).

The way the course is conducted varies from instructor to instructor, but the content of the course is usually presented in whole or in part by the students themselves. Instead of using a textbook, the students are given a list of definitions and, based on these, theorems which they are to prove and present in class, leading them through the subject material. The Moore method typically limits the amount of material that a class is able to cover, but its advocates claim that it induces a depth of understanding that listening to lectures cannot give.

See R.L. Moore and the Moore Method.

Anyway, for all that this method sounds fun, I wonder how well supported by evidence it is. I should check that.

Aside: It’s a pity the creator of the method was racist even by the standards of his time and place. Has he done damage to both people and pedagogy thereby? He would have refused to teach some of my colleagues.

10 Incoming

The Math Academy Way: Using the Power of Science to Supercharge Student Learning
Arbital Bayes’ rule: Guide
rmcelreath/stat_rethinking_2023: Statistical Rethinking Course for Jan-Mar 2023
Larry Wasserman’s stats course
Shalizi’s regression lectures
Moritz Hardt and Benjamin Recht, Patterns, predictions, and actions: A story about machine learning
May I draw your attention especially to Kroese et al. (2019), which I proof-read for my PhD supervisor Zdravko Botev, and enjoyed greatly? It smoothly bridges non-statistics mathematicians into applied statistics, without being excruciating, unlike layperson introductions. It is now freely available online.
Cosma’s links, targeted more to students committed to being statisticians.
Live Free or Dichotomize is full of examples.
There are also statistics podcasts.
Data Science: A First Introduction
The Book of Statistical Proofs (Soch et al. 2020) is really good! Simple proofs of basic probability results, shorn of all fluff and oriented to practical application.

The Book of Statistical Proofs – a centralized, open and collaboratively edited archive of statistical theorems for the computational sciences
Thinking and Explaining

I’m under the impression that mathematicians often have unspoken thought processes guiding their work which may be difficult to explain, or they feel too inhibited to try. One prototypical situation is this: there’s a mathematical object that’s obviously (to you) invariant under a certain transformation. For instance, a linear map might conserve volume for an ‘obvious’ reason. But you don’t have good language to explain your reason—so instead of explaining, or perhaps after trying to explain and failing, you fall back on computation. You turn the crank and without undue effort, demonstrate that the object is indeed invariant.

Here’s a specific example. Once I mentioned this phenomenon to Andy Gleason; he immediately responded that when he taught algebra courses, if he was discussing cyclic subgroups of a group, he had a mental image of group elements breaking into a formation organised into circular groups. He said that ‘we’ never would say anything like that to the students. His words made a vivid picture in my head, because it fit with how I thought about groups. I was reminded of my long struggle as a student, trying to attach meaning to ‘group’, rather than just a collection of symbols, words, definitions, theorems and proofs that I read in a textbook.

Numbas

Numbas is an online assessment system designed for mathematical subjects.

Developed by mathematicians at Newcastle University, Numbas is free to use and open-source.

Create a test in the online editor.

Share a link with your students, or upload it to your learning environment.

Students get randomised questions in their browser.

Answers are marked automatically and feedback is instant.

11 References

Chalice. 1995. “How to Teach a Class by the Modified Moore Method.” The American Mathematical Monthly.

Cohen. 1982. “A Modified Moore Method for Teaching Undergraduate Mathematics.” The American Mathematical Monthly.

Craiu, Gong, and Meng. 2023. “Six Statistical Senses.” Annual Review of Statistics and Its Application.

Gelman, Carlin, Stern, et al. 2013. Bayesian Data Analysis. Chapman & Hall/CRC texts in statistical science.

Gelman, Hill, and Vehtari. 2021. Regression and other stories.

Gelman, and Loken. 2012. “Statisticians: When We Teach, We Don’t Practice What We Preach.” Chance.

Gelman, and Nolan. 2017. Teaching Statistics: A Bag of Tricks.

Gigerenzer, and Hoffrage. 1995. “How to Improve Bayesian Reasoning Without Instruction: Frequency Formats.” Psychological Review.

Good, and Good. 1999. Resampling Methods: A Practical Guide to Data Analysis.

Hubbard. 2014. How to Measure Anything: Finding the Value of Intangibles in Business.

Kohavi, Tang, and Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing.

Krishnamurthi, and Fisler. 2020. “Data-Centricity: A Challenge and Opportunity for Computing Education.” Communications of the ACM.

Kroese, Botev, Taimre, et al. 2019. Mathematical and Statistical Methods for Data Science and Machine Learning. Chapman & Hall/CRC Machine Learning & Pattern Recognition.

Loredo, and Wolpert. 2024. “Bayesian Inference: More Than Bayes’s Theorem.”

Lovett, Meyer, and Thille. 2008. “The Open Learning Initiative: Measuring the Effectiveness of the OLI Statistics Course in Accelerating Student Learning.” Journal of Interactive Media in Education.

McElreath. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and STAN.

McElreath, and Boyd. 2007. Mathematical Models of Social Evolution: A Guide for the Perplexed.

Miller. 2013. The Chicago Guide to Writing about Multivariate Analysis. Chicago Guides to Writing, Editing, and Publishing.

Nolan, and Stoudt. 2020. “Reading to Write.” Significance.

Osborne. 2022. “Science Education in an Age of Misinformation.”

Sedlmeier, and Gigerenzer. 2001. “Teaching Bayesian Reasoning in Less Than Two Hours.” Journal of Experimental Psychology: General.

Soch, Proofs, Faulkenberry, et al. 2020. “StatProofBook/StatProofBook.github.io: StatProofBook 2020.”

Zitarelli. 2004. “The Origin and Early Impact of the Moore Method.” The American Mathematical Monthly.

Footnotes

¹ Pedantic note: one of my teacher friends points out that pedagogy is technically for educating children, while andragogy is for adults. But another friend thinks that term is sexist, and in any case at university we are all so ignorant of the theory of teaching that we don’t know about this distinction, so I run with pedagogy.