# Causal inference on DAGs

Confounding! This scientist performed a miracle graph surgery intervention and you won’t believe what happened next

October 26, 2016 — August 6, 2022

algebra
graphical models
how do science
machine learning
networks
probability
statistics

Making valid statistical inference, in the sense of making inference that is compatible with our understanding of the causal relationships that exist in the world (not just the correlations in our data). Graphical models and related techniques for doing it. Avoiding the danger of folk statistics. Observational studies, confounding, adjustment criteria, d-separation, identifiability, interventions, moral equivalence… Avoiding the ecological fallacy and Simpson’s paradox.

The gold standard, of course, is to work out whether A causes B by doing an experiment where no input but A changes, then observing B; that is what a controlled trial is. In practice this is unattainable because it would usually require cloning the entire state of the universe and running multiple copies in parallel. Statistically it can be nearly as good to do the experiment where we change A and all other influences apart from A are at least uncorrelated with A. That is what we more usually do: a randomised controlled trial. In many circumstances, though (budget restrictions, ethical constraints, bad experimental design…), we cannot do these ideal experiments, and a mathematical crutch is needed to get us the next-best outcome, which is to control for the things that we must (and not control for the things we must not).
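To make the contrast concrete, here is a minimal simulation sketch. Everything in it is an assumption invented for illustration: a single hypothetical confounder `z`, a binary treatment `a`, a linear outcome `b`, and the particular effect sizes. It compares the naive difference in means on observational data, the same difference under randomised assignment, and a regression that controls for `z`.

```python
import numpy as np


def simulate(n=200_000, seed=0, true_effect=1.0):
    """Compare three estimates of the effect of A on B (toy linear model)."""
    rng = np.random.default_rng(seed)

    # Hypothetical confounder Z, which influences both A and B.
    z = rng.normal(size=n)

    # Observational regime: treatment uptake depends on Z.
    a_obs = (z + rng.normal(size=n) > 0).astype(float)
    b_obs = true_effect * a_obs + 2.0 * z + rng.normal(size=n)

    # Randomised regime: A is a coin flip, so it is independent of Z.
    a_rct = rng.integers(0, 2, size=n).astype(float)
    b_rct = true_effect * a_rct + 2.0 * z + rng.normal(size=n)

    # Naive estimate: difference in means, ignoring Z.
    naive = b_obs[a_obs == 1].mean() - b_obs[a_obs == 0].mean()

    # Randomised estimate: the same difference in means, now unconfounded.
    rct = b_rct[a_rct == 1].mean() - b_rct[a_rct == 0].mean()

    # Adjusted estimate: least-squares regression of B on A and Z.
    design = np.column_stack([np.ones(n), a_obs, z])
    coef, *_ = np.linalg.lstsq(design, b_obs, rcond=None)
    adjusted = coef[1]

    return naive, rct, adjusted


if __name__ == "__main__":
    naive, rct, adjusted = simulate()
    print(f"naive observational: {naive:.2f}")
    print(f"randomised trial:    {rct:.2f}")
    print(f"adjusted for Z:      {adjusted:.2f}")
```

The adjusted estimate recovers the effect here only because the toy outcome is linear in `a` and `z` by construction and `z` is the only confounder; real data offers no such guarantee.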

In classic-flavoured causal inference, we use graphical models with the additional assumption that $A\rightarrow B$ may be read as “A causes a change in B”. This is what you end up with if you use a Structural Equation Model (a.k.a. a hierarchical model) to impose a causal structure on the observations. The result is a particular type of graph, a Directed Acyclic Graph (DAG), which, informally put, summarises what can possibly affect what in the model. Slightly more formally, it summarises what cannot (conditionally) affect what. Compare and contrast with conditional treatment effect estimation by potential outcomes.
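The “what cannot (conditionally) affect what” reading is d-separation, and it can be checked mechanically. The sketch below uses one standard characterisation: X and Y are d-separated by Z exactly when Z separates them in the moralised ancestral graph of X ∪ Y ∪ Z. The function names and the toy graph `PARENTS` are hypothetical, made up for this example, and not taken from any particular library.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, xs, ys, zs):
    """True if zs d-separates xs from ys in the DAG given as {child: [parents]}."""
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Restrict to the ancestral set of all the variables involved.
    keep = ancestors(parents, xs | ys | zs)

    # 2. Moralise: link each child to its parents, marry co-parents,
    #    and forget edge directions.
    adj = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)

    # 3. Delete the conditioning set and test undirected reachability.
    seen = set()
    stack = [x for x in xs if x not in zs]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in ys:
            return False
        stack.extend(adj[v] - zs - seen)
    return True


# Toy DAG: Z -> A, Z -> B (a fork) and A -> C <- B (a collider).
PARENTS = {"A": ["Z"], "B": ["Z"], "C": ["A", "B"]}

if __name__ == "__main__":
    print(d_separated(PARENTS, {"A"}, {"B"}, set()))       # fork left open
    print(d_separated(PARENTS, {"A"}, {"B"}, {"Z"}))       # fork blocked by Z
    print(d_separated(PARENTS, {"A"}, {"B"}, {"Z", "C"}))  # collider opened by C
```

The toy graph shows both halves of the rule of thumb: conditioning on the common cause `Z` blocks the association between `A` and `B`, while additionally conditioning on the common effect `C` reopens it.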

With this tool in hand I can ask when I can use my crappy observational data, collected without a good experimental design for whatever reason, to do interventional inference. There is a lot of research in this area; I should summarise the salient bits for myself.
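One standard answer, stated here as a sketch rather than a summary of the whole literature, is back-door adjustment. Assume there is an observed set of covariates $Z$ (my notation, not defined above) such that no member of $Z$ is a descendant of $A$ and $Z$ blocks every path between $A$ and $B$ that has an arrow into $A$. Then the interventional distribution is identified from purely observational quantities:

$$
P(B = b \mid \operatorname{do}(A = a)) = \sum_{z} P(B = b \mid A = a, Z = z)\, P(Z = z)
$$

If no such observed $Z$ exists, this particular formula is unavailable, and whether the effect is identifiable at all depends on the rest of the graph.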

## 1 What can go wrong?

What can I actually identify? For a start, if we are resorting to this more difficult methodology, that already suggests that we might be trying to use data which was collected with no regard to our actual statistical needs, so it may be a real stretch to imagine that we can find appropriate instruments in the data. Here is an essay on that theme.

• we have at least some control over the light but choose to let it fall where it may and then proclaim that whatever it illuminates is what we were looking for all along

## 2 Teaching

Yanir Seroussi’s Causal inference resources recommends:

Miguel Hernán and Jamie Robins’ causal inference book is available in free draft form online. See Yanir Seroussi’s review.

Jonas Peters’ notes from his teaching in 2015 (I think I took this course).

Samantha Kleinberg wrote two classes, introductory and advanced. The latter is notable for handling time-dependent causality.

Tutorial: David Sontag and Uri Shalit, Causal inference from observational studies.

Mastering Metrics: The Path from Cause to Effect

A resource list for causality in statistics, data science and physics.

Brady Neal’s Introduction to Causal Inference includes his draft textbook.

Chapter 3 of (some edition of) Pearl’s book is available as an author’s preprint: Part 1, 2, 3, 4, 5, 6.

Various classic introductions. Notably not recommended as a pedagogic experience (although as a reference text it is great and will make you smarter).

The dagitty intro is an interactive guide via visualizations. Likewise, the ggdag bias structure vignette shows off the useful explanation diagrams available in ggdag and is also a good introduction to selection bias and causal DAGs themselves.

Still confused? Overwhelmed? I am. How about a diagram?

## 4 References

Aalen, Røysland, Gran, et al. 2016. Statistical Methods in Medical Research.
Achab, Bacry, Gaïffas, et al. 2017. In PMLR.
Allen, Barrett, Horsman, et al. 2017. Physical Review X.
Aragam, Gu, and Zhou. 2017. arXiv:1703.04025 [Cs, Stat].
Aral, Muchnik, and Sundararajan. 2009. Proceedings of the National Academy of Sciences.
Arjovsky, Bottou, Gulrajani, et al. 2020.
Arnold, Castillo, and Sarabia. 1999. Conditional Specification of Statistical Models.
Athey, and Wager. 2019. arXiv:1902.07409 [Stat].
Bahadori, Chalupka, Choi, et al. 2017. arXiv:1702.02604 [Cs, Stat].
Bareinboim, and Pearl. 2016. Proceedings of the National Academy of Sciences.
Bareinboim, Tian, and Pearl. 2014. In AAAI.
Barnum, Barrett, Clark, et al. 2010. New Journal of Physics.
Besserve, Mehrjou, Sun, et al. 2019. In arXiv:1812.03253 [Cs, Stat].
Blom, Bongers, and Mooij. 2020. In Uncertainty in Artificial Intelligence.
Blom, and Mooij. 2020. “Robust Model Predictions via Causal Ordering.” In.
Bloniarz, Liu, Zhang, et al. 2015. arXiv:1507.03652 [Math, Stat].
Bonchi, Gullo, Mishra, et al. 2018. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18.
Bongers, Forré, Peters, et al. 2020. arXiv:1611.06221 [Cs, Stat].
Bongers, and Mooij. 2018. arXiv:1803.08784 [Cs, Stat].
Bongers, Peters, Schölkopf, et al. 2016. arXiv:1611.06221 [Cs, Stat].
Bottou, Peters, Quiñonero-Candela, et al. 2013. arXiv:1209.2355 [Cs, Math, Stat].
Braunstein, and Ingrosso. 2016. Scientific Reports.
Bright, Malinsky, and Thompson. 2016. Philosophy of Science.
Brito, and Pearl. 2002. Structural Equation Modeling: A Multidisciplinary Journal.
———. 2012. arXiv:1301.0560 [Cs].
Brodersen, Gallusser, Koehler, et al. 2015. The Annals of Applied Statistics.
Bühlmann. 2013. Mathematical Methods of Operations Research.
———. 2020. Statistical Science.
Bühlmann, Kalisch, and Meier. 2014. Annual Review of Statistics and Its Application.
Bühlmann, Peters, Ernest, et al. 2014.
Bühlmann, Rütimann, and Kalisch. 2013. Statistical Methods in Medical Research.
Chalak, and White. 2012. Neural Computation.
Chau, Ton, González, et al. 2021.
Chaves, Lemos, and Pienaar. 2018. Physical Review Letters.
Chen, and Pearl. 2012. “Regression and Causation: A Critical Examination of Econometric Textbooks.”
Christiansen, Pfister, Jakobsen, et al. 2020.
Claassen, Mooij, and Heskes. 2014. arXiv:1411.1557 [Stat].
Colombo, Maathuis, Kalisch, et al. 2012. The Annals of Statistics.
Cornish, Taufiq, Doucet, et al. 2023.
Dash. 2003.
Dawid. 2021. Journal of Causal Inference.
De Luna, Waernbaum, and Richardson. 2011. Biometrika.
Duvenaud, Eaton, Murphy, et al. 2010. In NIPS Causality: Objectives and Assessment.
Eichler. 2001. Granger-Causality Graphs for Multivariate Time Series.
Elwert. 2013. In Handbook of Causal Analysis for Social Research. Handbooks of Sociology and Social Research.
Entner, Hoyer, and Spirtes. 2013. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics.
Ernest, and Bühlmann. 2014. arXiv:1405.1868 [Stat].
Fernández-Loría, and Provost. 2021. arXiv:2104.04103 [Cs, Stat].
Fixx. 1977. Games for the superintelligent.
Foreman-Mackey, Montet, Hogg, et al. 2015. The Astrophysical Journal.
Fu, and Zhou. 2013. Journal of the American Statistical Association.
Gebharter, and Retzlaff. 2020. Synthese.
Gelman. 2010. American Journal of Sociology.
Gelman, and Meng. 2004. Applied Bayesian Modeling and Causal Inference From Incomplete-Data Perspectives.
Gendron, Witbrock, and Dobbie. 2023.
Genewein, McGrath, Déletang, et al. 2020. arXiv:2010.12237 [Cs].
Geng, Liu, Liu, et al. 2019. Annual Review of Statistics and Its Application.
Glymour. 1998. Philosophy of Science.
Gu, Fu, and Zhou. 2014. arXiv:1403.2310 [Stat].
Guo, Tóth, Schölkopf, et al. 2022.
Hansen, and Sokol. 2014. Electronic Journal of Probability.
Hernán, Miguel A. 2016. Annals of Epidemiology.
———. 2018. American Journal of Public Health.
Hernán, Miguel, and Robins. 2019a. Causal Inference Vol 3.
———. 2019b. Causal Inference Vol 2.
———. 2019c. Causal Inference Vol 1.
Hernán, Miguel A, and Robins. 2020. Causal Inference: What If.
Hinton, Osindero, and Bao. 2005. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics.
Hoyer, Janzing, Mooij, et al. 2009. In Advances in Neural Information Processing Systems 21.
Huang, and Kleinberg. 2015. In.
Hult, and Zachariah. 2020. arXiv:2012.08154 [Cs, Stat].
Hyttinen, Eberhardt, and Järvisalo. n.d.
Imbens, and Menzel. 2021. The Annals of Statistics.
Janzing, Mooij, Zhang, et al. 2012. Artificial Intelligence.
Janzing, and Schölkopf. 2010. IEEE Transactions on Information Theory.
Johansson, Fredrik, Shalit, and Sontag. 2016. In International Conference on Machine Learning.
Johansson, Fredrik D., Shalit, and Sontag. 2018. arXiv:1605.03661 [Cs, Stat].
Jordan, Michael Irwin. 1999. Learning in Graphical Models.
Jordan, Michael I., Wang, and Zhou. 2022.
Jordan, Michael I., and Weiss. 2002a. The Handbook of Brain Theory and Neural Networks.
———. 2002b. Handbook of Neural Networks and Brain Theory.
Kalainathan, Goudet, and Dutta. 2020. Journal of Machine Learning Research.
Kalisch, and Bühlmann. 2007. Journal of Machine Learning Research.
Kallus. 2020. Journal of Machine Learning Research.
Kennedy. 2015. arXiv Preprint arXiv:1510.04740.
Kilbertus, Rojas Carulla, Parascandolo, et al. 2017. In Advances in Neural Information Processing Systems 30.
Kim, and Pearl. 1983. In IJCAI.
Kleinberg. 2012. Causality, Probability, and Time.
———. 2015. Why: A Guide to Finding and Using Causes.
Kocaoglu, Snyder, Dimakis, et al. 2017. arXiv:1709.02023 [Cs, Math, Stat].
Kohavi, Tang, and Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing.
Kohler, Kreuter, and Stuart. 2019. Annual Review of Statistics and Its Application.
Koller, and Friedman. 2009. Probabilistic Graphical Models : Principles and Techniques.
Künzel, Sekhon, Bickel, et al. 2019. Proceedings of the National Academy of Sciences.
Lauritzen, Steffen L. 1996. Graphical Models. Oxford Statistical Science Series.
———. 2000. In Complex Stochastic Systems.
Lauritzen, S. L., and Spiegelhalter. 1988. Journal of the Royal Statistical Society. Series B (Methodological).
Lee, and Bareinboim. 2021. In.
———. n.d. “Causal Effect Identifiability Under Partial-Observability.”
Locatello, Bauer, Lucic, et al. 2019. arXiv:1811.12359 [Cs, Stat].
Lopez-Paz, Nishihara, Chintala, et al. 2016. arXiv:1605.08179 [Cs, Stat].
Louizos, Shalit, Mooij, et al. 2017. In Advances in Neural Information Processing Systems 30.
Maathuis, and Colombo. 2013. arXiv Preprint arXiv:1307.5636.
Maathuis, Colombo, Kalisch, et al. 2010. Nature Methods.
Maathuis, Kalisch, and Bühlmann. 2009. The Annals of Statistics.
Malinsky, Shpitser, and Richardson. 2019. arXiv:1903.03662 [Stat].
Marbach, Prill, Schaffter, et al. 2010. Proceedings of the National Academy of Sciences.
Meinshausen. 2018. In 2018 IEEE Data Science Workshop (DSW).
Messerli. 2012. New England Journal of Medicine.
Mihalkova, and Mooney. 2007. In Proceedings of the 24th International Conference on Machine Learning.
Mogensen, Malinsky, and Hansen. 2018. In UAI2018.
Montanari. 2011.
Mooij, Peters, Janzing, et al. 2016. Journal of Machine Learning Research.
Morgan, and Winship. 2015. Counterfactuals and Causal Inference.
Msaouel. 2022. Cancer Investigation.
Murphy. 2012. Machine learning: a probabilistic perspective. Adaptive computation and machine learning series.
Murray, Swanson, and Hernán. 2019. arXiv:1911.06030 [Stat].
Neal. 2020. Course Lecture Notes (Draft).
Neapolitan. 2003. Learning Bayesian Networks.
Ng, Fang, Zhu, et al. 2020. arXiv:1910.08527 [Cs, Stat].
Ng, Zhu, Chen, et al. 2019. In Advances In Neural Information Processing Systems.
Nilsson, Bonander, Strömberg, et al. 2021. International Journal of Epidemiology.
Noel, and Nyhan. 2011. Social Networks.
Ortega, Kunesch, Delétang, et al. 2021. arXiv:2110.10819 [Cs].
Pawlowski, Paterek, Kaszlikowski, et al. 2009. Nature.
Pearl. 1982. In Proceedings of the Second AAAI Conference on Artificial Intelligence. AAAI’82.
———. 1986. Artificial Intelligence.
———. 1998. In Quantified Representation of Uncertainty and Imprecision. Handbook of Defeasible Reasoning and Uncertainty Management Systems.
———. 2008. Probabilistic reasoning in intelligent systems: networks of plausible inference. The Morgan Kaufmann series in representation and reasoning.
———. 2009a. Statistics Surveys.
———. 2009b. Causality: Models, Reasoning and Inference.
———. 2010. Sociological Methodology.
———. 2011.
———. 2012. “The Do-Calculus Revisited Judea Pearl Keynote Lecture, August 17, 2012 UAI-2012 Conference, Catalina, CA.” Edited by Nando de Freitas and Kevin Murphy.
Pearl, and Bareinboim. 2014. Statistical Science.
Pearl, Glymour, and Jewell. 2016. Causal Inference in Statistics: A Primer.
Peters. 2015.
Peters, Bühlmann, and Meinshausen. 2015. arXiv:1501.01332 [Stat].
Peters, Janzing, and Schölkopf. 2017. Elements of Causal Inference: Foundations and Learning Algorithms. Adaptive Computation and Machine Learning Series.
Peters, Mooij, Janzing, et al. 2014. “Causal Discovery with Continuous Additive Noise Models.” The Journal of Machine Learning Research.
Raginsky. 2011. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
Rakesh, Guo, Moraffah, et al. 2018. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18.
Rehkopf, Glymour, and Osypuk. 2016. Current Epidemiology Reports.
Richardson, Thomas S., and Robins. 2013.
Richardson, Thomas, and Spirtes. 2002. Annals of Statistics.
Robins. 1997. In Latent Variable Modeling and Applications to Causality. Lecture Notes in Statistics.
Rohrer. 2018. Advances in Methods and Practices in Psychological Science.
Rothenhäusler, Meinshausen, Bühlmann, et al. 2020. arXiv:1801.06229 [Stat].
Rotnitzky, and Smucler. 2020. Journal of Machine Learning Research.
Rubenstein, Bongers, Schölkopf, et al. 2018. In Uncertainty in Artificial Intelligence.
Rubenstein, Weichwald, Bongers, et al. 2017. arXiv:1707.00819 [Cs, Stat].
Rubin, and Waterman. 2006. Statistical Science.
Sauer, and VanderWeele. 2013. Use of Directed Acyclic Graphs.
Schölkopf. 2022. In Probabilistic and Causal Inference: The Works of Judea Pearl.
Schölkopf, Hogg, Wang, et al. 2015. arXiv:1505.03036 [Astro-Ph, Stat].
Schölkopf, Janzing, Peters, et al. 2012. In ICML 2012.
Schölkopf, Locatello, Bauer, et al. 2021. Proceedings of the IEEE.
Schölkopf, Muandet, Fukumizu, et al. 2015. arXiv:1501.06794 [Cs, Stat].
Schulam, and Saria. 2017. In Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17.
Shalizi, and McFowland III. 2016. arXiv:1607.06565 [Physics, Stat].
Shalizi, and Thomas. 2011. Sociological Methods & Research.
Sharma, and Kiciman. 2020.
Shpitser, and Pearl. 2008. “Complete Identification Methods for the Causal Hierarchy.” The Journal of Machine Learning Research.
Shpitser, and Tchetgen. 2014. arXiv:1411.2127 [Stat].
Shrier, and Platt. 2008. BMC Medical Research Methodology.
Smith, David A., and Eisner. 2008. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Smith, Bonnie, Ogburn, McGue, et al. 2020. arXiv:2007.04511 [Stat].
Spirtes, Glymour, and Scheines. 2001. Causation, Prediction, and Search. Adaptive Computation and Machine Learning.
Subbaswamy, Schulam, and Saria. 2019. In The 22nd International Conference on Artificial Intelligence and Statistics.
Suzuki, Shinozaki, and Yamamoto. 2020. Journal of Epidemiology.
Textor, Idelberger, and Liśkiewicz. 2015. arXiv:1508.00280 [Cs].
Textor, and Liśkiewicz. 2011. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence. UAI’11.
Tschantz, Sen, and Datta. 2019. arXiv:1710.05899 [Cs].
Tu, Zhang, Ackermann, et al. 2018. arXiv:1807.04010 [Cs, Stat].
van der Zander, and Liśkiewicz. 2016. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. AAAI’16.
van der Zander, Liśkiewicz, and Textor. 2014. In Proceedings of the UAI 2014 Conference on Causal Inference: Learning and Prediction - Volume 1274. CI’14.
van der Zander, Textor, and Liskiewicz. 2015. In Proceedings of the 24th International Conference on Artificial Intelligence. IJCAI’15.
Vansteelandt, Bekaert, and Claeskens. 2012. Statistical Methods in Medical Research.
Veitch, and Zaveri. 2020.
Visweswaran, and Cooper. 2014. arXiv:1407.2483 [Cs, Stat].
Wang, Dun, Hogg, Foreman-Mackey, et al. 2017. arXiv:1710.02428 [Astro-Ph].
Wang, Yuhao, Solus, Yang, et al. 2017.
Weichwald, and Peters. 2020. arXiv:2002.06060 [q-Bio, Stat].
Westfall, and Yarkoni. 2016. PLOS ONE.
Wong. 2020. arXiv:2007.10979 [Cs, Stat].
Wright. 1934. The Annals of Mathematical Statistics.
Yadav, Prunelli, Hoff, et al. 2016. arXiv:1611.04660 [Cs, Stat].
Yang, Liu, Chen, et al. 2020. arXiv:2004.08697 [Cs, Stat].
Yedidia, Freeman, and Weiss. 2003. In Exploring Artificial Intelligence in the New Millennium.
Zhang, Peters, Janzing, et al. 2012. arXiv:1202.3775 [Cs, Stat].
Zheng, Aragam, Ravikumar, et al. 2018. In Advances in Neural Information Processing Systems 31.