Bayes for beginners

Even for the most curmudgeonly frequentist, it is sometimes refreshing to stop deriving frequentist estimators for intractable models and just use the damn Bayesian ones, which fail in interesting ways, different from the ones you are used to. If it works and you are feeling fancy, you might then justify your Bayesian method on frequentist grounds, which washes away the sin.

Here are some scattered tidbits about getting into it. No attempt is made to be comprehensive, novel, or even expert.


Course material

So many! Too many. Actually I kinda like McElreath’s stuff (McElreath 2020) to teach from; you get practical quite quickly.

Linear regression

This workhorse pops up everywhere.

Deisenroth and Zafeiriou (2017), Mathematics for Inference and Machine Learning, give an ML perspective.
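To make the workhorse concrete, here is a minimal sketch of conjugate Bayesian linear regression for a single predictor. It is not taken from the text above; it assumes independent N(0, prior_sd^2) priors on the intercept and slope and a known noise standard deviation, in which case the posterior is Gaussian and available in closed form. The function name `posterior_line` and the simulated data are purely illustrative.

```python
import random


def posterior_line(xs, ys, noise_sd=1.0, prior_sd=10.0):
    """Posterior over (intercept, slope) for y = a + b*x + Gaussian noise.

    Assumes a N(0, prior_sd^2) prior on each coefficient and a known noise_sd.
    Returns the posterior mean (a, b) and the 2x2 posterior covariance.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    s2 = noise_sd ** 2
    # Posterior precision = prior precision + X^T X / noise variance.
    p00 = 1.0 / prior_sd ** 2 + n / s2
    p01 = sx / s2
    p11 = 1.0 / prior_sd ** 2 + sxx / s2
    det = p00 * p11 - p01 * p01
    # Invert the 2x2 precision matrix to get the posterior covariance.
    cov = [[p11 / det, -p01 / det], [-p01 / det, p00 / det]]
    # Posterior mean = covariance times (X^T y / noise variance).
    b0, b1 = sy / s2, sxy / s2
    mean = (cov[0][0] * b0 + cov[0][1] * b1,
            cov[1][0] * b0 + cov[1][1] * b1)
    return mean, cov


# Simulated data with a true intercept of 1 and a true slope of 2.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
mean, cov = posterior_line(xs, ys, noise_sd=0.5)
print("posterior mean (intercept, slope):", mean)
print("posterior sd of slope:", cov[1][1] ** 0.5)
```

With a vague prior the posterior mean is close to the least-squares fit; shrinking `prior_sd` pulls the coefficients towards zero, which is the ridge-regression connection.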


If we want to use Bayesian tools to do science, there is a principled workflow that we need to be thinking about. For a fun rant, read Shalizi on Praxis and Ideology in Bayesian Data Analysis, about Gelman and Shalizi (2013).

The visualisation howto from, basically, the Stan team, is deeper than it sounds and highly recommended (Gabry et al. 2019).
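One step that Gabry et al. (2019) discuss is the prior predictive check: simulate data from the priors alone and ask whether it looks even remotely plausible. The following is a rough sketch of that idea, not their code; the linear model, the two prior scales, and the helper names `prior_predictive` and `typical_extreme` are assumptions made for illustration.

```python
import random
import statistics


def prior_predictive(prior_sd, xs, n_draws=200, noise_sd=1.0, seed=0):
    """Simulate datasets from y = a + b*x + noise with a, b ~ N(0, prior_sd^2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        a = rng.gauss(0.0, prior_sd)
        b = rng.gauss(0.0, prior_sd)
        draws.append([a + b * x + rng.gauss(0.0, noise_sd) for x in xs])
    return draws


def typical_extreme(prior_sd, xs):
    """Median, over simulated datasets, of the largest |y| in each dataset."""
    sims = prior_predictive(prior_sd, xs)
    return statistics.median(max(abs(y) for y in sim) for sim in sims)


# Hypothetical standardised predictor on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
for prior_sd in (100.0, 1.0):
    print("prior sd", prior_sd, "typical extreme outcome", typical_extreme(prior_sd, xs))
```

If the outcome is, say, a standardised measurement, a prior that routinely generates values in the hundreds is telling you something before any data arrive. In practice you would plot the simulated datasets rather than summarise them with one number.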

Michael Betancourt’s examples, for example his workflow tips, are a good start for practical work, incorporating the inevitable collision of statistical and computational difficulties.

See also BAT, the Bayesian Analysis Toolkit, which does sophisticated Bayesian modelling, although AFAICT it uses a fairly basic sampler?

See also the notes on Rao-Blackwellisation, which can make MCMC inference faster and even lets you handle discrete parameters in Stan.
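A toy illustration of the Rao-Blackwellisation idea, not taken from the linked notes: wherever part of an expectation can be computed analytically, replace the sampled quantity with its conditional expectation. The model here (X ~ N(0, 1), Y | X ~ N(X, 1), target P(Y > c)) and the function names are assumptions chosen so that the exact answer is known.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_estimates(n, c=1.5, seed=0):
    """Per-sample terms for estimating P(Y > c), X ~ N(0,1), Y | X ~ N(X,1)."""
    rng = random.Random(seed)
    naive, rao_blackwell = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(x, 1.0)
        # Naive term: indicator of the sampled Y.
        naive.append(1.0 if y > c else 0.0)
        # Rao-Blackwellised term: E[indicator | X], with Y integrated out.
        rao_blackwell.append(1.0 - normal_cdf(c - x))
    return naive, rao_blackwell


naive, rb = tail_estimates(5000)
# Marginally Y ~ N(0, 2), so the exact tail probability is available.
exact = 1.0 - normal_cdf(1.5 / math.sqrt(2.0))
print("naive estimate:", statistics.mean(naive))
print("Rao-Blackwellised estimate:", statistics.mean(rb))
print("exact:", exact)
print("per-sample variances:", statistics.pvariance(naive), statistics.pvariance(rb))
```

Both estimators target the same quantity, but the conditional-expectation version has smaller variance by the law of total variance. Summing a discrete parameter out of the likelihood, as one has to do in Stan, is the same move applied to the model itself.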


Dirichlet processes, Gaussian process regression etc. 🏗

As a methodology of science

Not quite.


References

Alquier, Pierre. 2021. “User-Friendly Introduction to PAC-Bayes Bounds.” arXiv:2110.11216 [Cs, Math, Stat], October.
Bacchus, F, H E Kyburg, and M Thalos. 1990. “Against Conditionalization.” Synthese 85 (3): 475–506.
Barbier, Jean, and Nicolas Macris. 2017. “The Stochastic Interpolation Method: A Simple Scheme to Prove Replica Formulas in Bayesian Inference.” arXiv:1705.02780 [Cond-Mat], May.
Bernardo, José M., and Adrian F. M. Smith. 2000. Bayesian Theory. 1 edition. Chichester: Wiley.
Carpenter, Bob, Matthew D. Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, and Michael Betancourt. 2015. “The Stan Math Library: Reverse-Mode Automatic Differentiation in C++.” arXiv Preprint arXiv:1509.07164.
Caruana, Rich. 1998. “Multitask Learning.” In Learning to Learn, 95–133. Springer, Boston, MA.
Deisenroth, Marc, and Stefanos Zafeiriou. 2017. “Mathematics for Inference and Machine Learning.” Dept. Comput., Imperial College London, London, UK, Tech. Rep., Accessed on Jul, 126.
Diaconis, Persi, and Donald Ylvisaker. 1979. “Conjugate Priors for Exponential Families.” The Annals of Statistics 7 (2): 269–81.
Domingos, Pedro. 2020. “Every Model Learned by Gradient Descent Is Approximately a Kernel Machine.” arXiv:2012.00152 [Cs, Stat], November.
Fink, Daniel. 1997. “A Compendium of Conjugate Priors,” 46.
Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 182 (2): 389–402.
Gelman, Andrew. 2006. “Prior Distributions for Variance Parameters in Hierarchical Models (Comment on Article by Browne and Draper).” Bayesian Analysis 1 (3): 515–34.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3 edition. Chapman & Hall/CRC texts in statistical science. Boca Raton: Chapman and Hall/CRC.
Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. Cambridge, UK: Cambridge University Press.
Gelman, Andrew, and Deborah Nolan. 2017. Teaching Statistics: A Bag of Tricks. 2 edition. Oxford: Oxford University Press.
Gelman, Andrew, and Donald B. Rubin. 1995. “Avoiding Model Selection in Bayesian Social Research.” Sociological Methodology 25: 165–73.
Gelman, Andrew, and Cosma Rohilla Shalizi. 2013. “Philosophy and the Practice of Bayesian Statistics.” British Journal of Mathematical and Statistical Psychology 66 (1): 8–38.
Gelman, Andrew, and Yuling Yao. 2021. “Holes in Bayesian Statistics.” Journal of Physics G: Nuclear and Particle Physics 48 (1): 014002.
Goodman, Noah, Vikash Mansinghka, Daniel Roy, Keith Bonawitz, and Daniel Tarlow. 2012. “Church: A Language for Generative Models.” arXiv:1206.3255, June.
Goodrich, Ben, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Bob Carpenter, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. “Stan: A Probabilistic Programming Language.” Journal of Statistical Software 76 (1).
Howard, R.A. 1970. “Decision Analysis: Perspectives on Inference, Decision, and Experimentation.” Proceedings of the IEEE 58 (5): 632–43.
Hubbard, Douglas W. 2014. How to Measure Anything: Finding the Value of Intangibles in Business. 3 edition. Hoboken, New Jersey: Wiley.
Khan, Mohammad Emtiyaz, and Håvard Rue. 2022. “The Bayesian Learning Rule.” arXiv.
Li, Meng, and David B. Dunson. 2016. “A Framework for Probabilistic Inferences from Imperfect Models.” arXiv:1611.01241 [Stat], November.
Linden, Sander van der, and Breanne Chryst. 2017. “No Need for Bayes Factors: A Fully Bayesian Evidence Synthesis.” Frontiers in Applied Mathematics and Statistics 3.
Ma, Wei Ji, Konrad Paul Kording, and Daniel Goldreich. n.d. Bayesian Models of Perception and Action.
MacKay, David J. C. 1995. “Probable Networks and Plausible Predictions: A Review of Practical Bayesian Methods for Supervised Neural Networks.” Network: Computation in Neural Systems 6 (3): 469–505.
MacKay, David J. C. 1999. “Comparison of Approximate Methods for Handling Hyperparameters.” Neural Computation 11 (5): 1035–68.
Mandt, Stephan, Matthew D. Hoffman, and David M. Blei. 2017. “Stochastic Gradient Descent as Approximate Bayesian Inference.” JMLR, April.
Martin, Gael M., David T. Frazier, and Christian P. Robert. 2020. “Computing Bayes: Bayesian Computation from 1763 to the 21st Century.” arXiv:2004.06425 [Stat], December.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and STAN. Boca Raton: CRC Press.
Raftery, Adrian E. 1995. “Bayesian Model Selection in Social Research.” Sociological Methodology 25: 111–63.
Robert, Christian P. 2007. The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. 2nd ed. Springer texts in statistics. New York: Springer.
Schervish, Mark J. 2012. Theory of Statistics. Springer Series in Statistics. New York, NY: Springer Science & Business Media.
Schoot, Rens van de, Sarah Depaoli, Ruth King, Bianca Kramer, Kaspar Märtens, Mahlet G. Tadesse, Marina Vannucci, et al. 2021. “Bayesian Statistics and Modelling.” Nature Reviews Methods Primers 1 (1): 1–26.
Stuart, Andrew M. 2010. “Inverse Problems: A Bayesian Perspective.” Acta Numerica 19: 451–559.
Zellner, Arnold. 1988. “Optimal Information Processing and Bayes’s Theorem.” The American Statistician 42 (4): 278–80.
Zellner, Arnold. 2002. “Information Processing and Bayesian Analysis.” Journal of Econometrics, Information and Entropy Econometrics, 107 (1): 41–50.
