Inferring cause and effect from nature, especially from observational (as opposed to ideal experimental) data, where it is hard.
Graphical models and related techniques for doing it.
Avoiding the danger of folk statistics.
Observational studies, confounding, adjustment criteria, *d*-separation,
identifiability, interventions, moral equivalence…
Avoidance of the ecological fallacy and Simpson’s paradox.
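
To make Simpson’s paradox concrete, here is a minimal sketch with illustrative counts (patterned on the familiar kidney-stone textbook example; the names `counts`, `rate` and `pooled_rate` are my own). Treatment A wins inside every stratum yet loses once the strata are pooled, because the treatments were applied to strata of different difficulty at very different rates.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per treatment and stratum.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate as a fraction."""
    return Fraction(successes, trials)


def pooled_rate(strata):
    """Success rate after throwing away the stratum label."""
    successes = sum(s for s, _ in strata.values())
    trials = sum(n for _, n in strata.values())
    return Fraction(successes, trials)


# Best treatment within each stratum.
stratum_winner = {
    stratum: max(counts, key=lambda t: rate(*counts[t][stratum]))
    for stratum in ("small", "large")
}
# Best treatment when strata are ignored.
pooled_winner = max(counts, key=lambda t: pooled_rate(counts[t]))

print(stratum_winner, pooled_winner)
```

Which of the two comparisons is the causally relevant one cannot be read off the table; it depends on the assumed causal structure, which is the point of the rest of this notebook.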

The gold standard, of course, is to work out if A causes B by doing an experiment where no input but A changes, then observing B. Statistically, it is nearly as good to do an experiment where all other influences apart from A are at least uncorrelated with A. In many circumstances, though (budget restrictions, ethical constraints, bad experimental design…), we cannot do these ideal experiments, and a mathematical crutch is needed to get us the next-best outcome.
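
A small simulation shows why making the other influences uncorrelated with A matters. This is a sketch under assumptions of my own: a linear Gaussian model, a single hidden influence `z`, and a true effect of A on B of 2.0. In the observational regime `z` drives both A and B; in the experimental regime A is assigned independently of `z`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = 2.0  # assumed causal effect of A on B


def slope(a, b):
    """OLS slope of b regressed on a (with intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


# Hidden influence on B, present in both regimes.
z = rng.normal(size=n)

# Observational regime: z also influences A, so A and z are correlated.
a_obs = z + rng.normal(size=n)
b_obs = true_effect * a_obs + 3.0 * z + rng.normal(size=n)

# Experimental regime: A is assigned independently of z.
a_exp = rng.normal(size=n)
b_exp = true_effect * a_exp + 3.0 * z + rng.normal(size=n)

naive = slope(a_obs, b_obs)       # biased by the hidden influence
randomized = slope(a_exp, b_exp)  # close to the true effect

print(f"observational slope: {naive:.2f}")
print(f"randomized slope:    {randomized:.2f}")
```

Under these assumptions the observational slope converges to 3.5 rather than 2.0, since the regression attributes part of the influence of `z` to A.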

The most well-trodden path in this circumstance is using
graphical models
with the additional assumption that \(A\rightarrow B\) may be read as “A causes a change in B”.
This is what you end up with if you use a Structural Equation Model (a.k.a. a hierarchical model) to impose a causal structure on the observations.
The result is a particular type of graph, a *Directed Acyclic Graph* (DAG), which, informally put, summarises what can possibly affect what in your data.
Compare and contrast treatment effect estimation.
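
What a DAG buys us is a mechanical way to read off conditional independences, namely *d*-separation. Below is a minimal pure-Python sketch using the ancestral moral graph criterion; the representation (a dict mapping each node to its parents) and the function names are my own choices, not any particular library’s API, and the three node sets are assumed disjoint.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated by zs in the DAG `parents`."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    # 1. Restrict to the ancestral set of all nodes involved.
    keep = ancestors(parents, xs | ys | zs)
    # 2. Moralise: link each node to its parents, marry co-parents,
    #    and forget edge directions.
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(pa, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # 3. Separated iff no undirected path from xs to ys avoids zs.
    seen, stack = set(), list(xs)
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in ys:
            return False
        stack.extend(u for u in nbrs[v] if u not in zs and u not in seen)
    return True


# Hypothetical example: Z confounds A and B; C is a common effect of A and B.
dag = {"Z": [], "A": ["Z"], "B": ["Z"], "C": ["A", "B"]}
print(d_separated(dag, {"A"}, {"B"}, set()))       # open path via Z
print(d_separated(dag, {"A"}, {"B"}, {"Z"}))       # blocked by Z
print(d_separated(dag, {"A"}, {"B"}, {"Z", "C"}))  # collider C reopens a path
```

The last query is the one folk statistics tends to get wrong: conditioning on the common effect `C` creates a dependence between `A` and `B` that was not there before.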

When can I use my crappy observational data, collected without a good experimental design for whatever reason, to do interventional inference? There is a lot of research in this area; I should summarise the salient bits for myself.

What can I actually identify?
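
One standard answer, sketched here under the assumption that a set of observed covariates \(Z\) satisfies the backdoor criterion relative to \((A, B)\) in the assumed DAG (the symbol \(Z\) is introduced for illustration): the interventional distribution is identified by the adjustment formula

\[
P(B = b \mid \mathrm{do}(A = a)) = \sum_{z} P(B = b \mid A = a, Z = z)\, P(Z = z).
\]

If no such adjustment set is observed, this particular route is closed, although other identification strategies may still apply.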

*OMFG Exogenous Variation! Or, Can You Find Good Nails When You Find an Indonesian Politics Hammer* quotes Angus Deaton:

“we have at least some control over the light but choose to let it fall where it may and then proclaim that whatever it illuminates is what we were looking for all along”

See also quantum causal graphical models, and the use of classical causal graphical models to eliminate hidden quantum causes.

## Learning materials

Yanir Seroussi reviews various books in Causal inference resources.

Miguel Hernán and Jamie Robins’ causal inference book (Miguel A. Hernán and Robins 2020) has a free draft online. See Yanir Seroussi’s review.

Jonas Peters’ notes from his teaching in 2015 (I may have taken this course; can’t recall exactly).

Samantha Kleinberg wrote two books, one introductory and one advanced. The latter is notable for handling time-dependent causality.

Tutorial: David Sontag and Uri Shalit, Causal inference from observational studies.

Mastering Metrics: The Path from Cause to Effect.

A resource list for causality in statistics, data science and physics.

I speculate that in realistic causal networks or DAGs, the number of possible correlations grows faster than the number of possible causal relationships. So confounds really are that common, and since people do not think in DAGs, the imbalance also explains overconfidence.
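
A toy count that is at least consistent with this speculation (it illustrates, and proves nothing). Assuming faithfulness, two variables in a DAG are marginally dependent exactly when they share an ancestor, counting each node as its own ancestor. The sketch below compares the number of direct causal arrows with the number of such dependent pairs; the helper names and the chain example are my own.

```python
from itertools import combinations


def ancestors_of(parents, v):
    """The node v together with all of its ancestors."""
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(parents.get(u, ()))
    return seen


def edges_vs_dependent_pairs(parents):
    """Count direct arrows and marginally dependent (d-connected) pairs."""
    n_edges = sum(len(pa) for pa in parents.values())
    anc = {v: ancestors_of(parents, v) for v in parents}
    n_pairs = sum(1 for u, v in combinations(parents, 2) if anc[u] & anc[v])
    return n_edges, n_pairs


def chain(n):
    """The chain DAG 0 -> 1 -> ... -> n-1 as a node-to-parents dict."""
    return {i: ([i - 1] if i > 0 else []) for i in range(n)}


for n in (5, 10, 20):
    print(n, edges_vs_dependent_pairs(chain(n)))
```

In a chain the arrows grow linearly in the number of nodes while the dependent pairs grow quadratically, so most observed correlations do not correspond to any direct arrow.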

Felix Elwert’s summary. (Elwert 2013)

Chapter 3 of (some edition of) Pearl’s book is available as an author’s preprint: Part 1, 2, 3, 4, 5, 6.

Stanford encyclopaedia of philosophy entry.

Various classic introductions exist (Pearl 2012, 1998; Elwert 2013; Morgan and Winship 2015; Rohrer 2018). Koller and Friedman (2009) is notably not recommended as a pedagogic experience, although as a reference text it is great and will make you smarter.

The dagitty intro is an interactive guide via visualizations.
Likewise, the ggdag bias structure vignette shows off the useful explanation diagrams available in `ggdag` and is also a good introduction to selection bias and causal DAGs themselves.

Amit Sharma’s tutorial at KDD.

See also Emily Riederer’s Causal design patterns for data analysts, and Spurious correlation induced by sampling bias.

## do-calculus

## In modern machine learning

Cunning modern nonparametric approaches such as Künzel et al. (2019) are covered in the causality notebook.

## Continuously indexed fields

More generally than the typical framing, where we have a few distinct variables joined by arrows of inference, we might be concerned with continuously indexed random fields.

## External validity

See external validity.

## Potential outcomes approach

A.k.a. Neyman-Rubin school. See potential outcomes.

## Inferring a causal graph from data

Uh oh. You don’t know what causes what?
Or specifically, you can’t eliminate a whole bunch of potential causal arrows *a priori*?
Much more work.

Here is a seminar I noticed on this theme, which is also a lightspeed introduction to some difficulties.

Guido Consonni, *Objective Bayes Model Selection of Gaussian Essential Graphs with Observational and Interventional Data*:

Graphical models based on Directed Acyclic Graphs (DAGs) represent a powerful tool for investigating dependencies among variables. It is well known that one cannot distinguish between DAGs encoding the same set of conditional independencies (Markov equivalent DAGs) using only observational data. However, the space of all DAGs can be partitioned into Markov equivalence classes, each being represented by a unique Essential Graph (EG), also called Completed Partially Directed Graph (CPDAG). In some fields, in particular genomics, one can have both observational and interventional data, the latter being produced after an exogenous perturbation of some variables in the system, or from randomized intervention experiments. Interventions destroy the original causal structure, and modify the Markov property of the underlying DAG, leading to a finer partition of DAGs into equivalence classes, each one being represented by an Interventional Essential Graph (I-EG) (Hauser and Buehlmann). In this talk we consider Bayesian model selection of EGs under the assumption that the variables are jointly Gaussian. In particular, we adopt an objective Bayes approach, based on the notion of fractional Bayes factor, and obtain a closed form expression for the marginal likelihood of an EG. Next we construct a Markov chain to explore the EG space under a sparsity constraint, and propose an MCMC algorithm to approximate the posterior distribution over the space of EGs. Our methodology, which we name Objective Bayes Essential graph Search (OBES), allows to evaluate the inferential uncertainty associated to any features of interest, for instance the posterior probability of edge inclusion. An extension of OBES to deal simultaneously with observational and interventional data is also presented: this involves suitable modifications of the likelihood and prior, as well as of the MCMC algorithm.
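
The Markov equivalence mentioned in that abstract can be checked mechanically. By the classical characterisation due to Verma and Pearl, two DAGs on the same nodes are Markov equivalent exactly when they share a skeleton and the same v-structures (colliders whose two parents are non-adjacent). A minimal sketch follows, with DAGs again written as node-to-parents dicts; the function names and the three-node examples are my own.

```python
from itertools import combinations


def skeleton(parents):
    """The set of undirected edges of the DAG."""
    return {frozenset((p, c)) for c, pa in parents.items() for p in pa}


def v_structures(parents):
    """Colliders p -> c <- q where p and q are not adjacent."""
    skel = skeleton(parents)
    return {
        (frozenset((p, q)), c)
        for c, pa in parents.items()
        for p, q in combinations(sorted(pa), 2)
        if frozenset((p, q)) not in skel
    }


def markov_equivalent(g, h):
    """Same skeleton and same v-structures (same node set assumed)."""
    return skeleton(g) == skeleton(h) and v_structures(g) == v_structures(h)


chain_fwd = {"X": [], "Y": ["X"], "Z": ["Y"]}  # X -> Y -> Z
chain_bwd = {"Z": [], "Y": ["Z"], "X": ["Y"]}  # X <- Y <- Z
fork = {"Y": [], "X": ["Y"], "Z": ["Y"]}       # X <- Y -> Z
collider = {"X": [], "Z": [], "Y": ["X", "Z"]}  # X -> Y <- Z

print(markov_equivalent(chain_fwd, chain_bwd))
print(markov_equivalent(chain_fwd, fork))
print(markov_equivalent(chain_fwd, collider))
```

The two chains and the fork encode the same single conditional independence, so observational data alone cannot orient their arrows; only the collider is distinguishable.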

## Causal time series

As with other time series methods, causal inference for time series has its own issues.

🏗 Find out how CausalImpact works (based on Brodersen et al. (2015)).

The CausalImpact R package implements an approach to estimating the causal effect of a designed intervention on a time series. For example, how many additional daily clicks were generated by an advertising campaign? Answering a question like this can be difficult when a randomized experiment is not available. The package aims to address this difficulty using a structural Bayesian time-series model to estimate how the response metric might have evolved after the intervention if the intervention had not occurred.
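
The counterfactual-forecast idea in that description can be sketched in a few lines. To be loud about the assumptions: this is *not* the CausalImpact API and not a Bayesian structural time-series model; it swaps in ordinary least squares on a single synthetic control series that is assumed to be unaffected by the intervention, and every name and number below is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pre, n_post, lift = 200, 50, 5.0  # hypothetical sizes and true effect
t = np.arange(n_pre + n_post)

# Synthetic control series, assumed unaffected by the intervention.
control = 10 + 2 * np.sin(t / 10) + rng.normal(scale=0.3, size=t.size)
# The response tracks the control, plus a constant lift after the intervention.
response = 2.0 + 1.5 * control + rng.normal(scale=0.5, size=t.size)
response[n_pre:] += lift


def design(x):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones_like(x), x])


# Fit the response/control relationship on pre-intervention data only.
coef, *_ = np.linalg.lstsq(design(control[:n_pre]), response[:n_pre], rcond=None)

# Forecast what the response would have been without the intervention.
counterfactual = design(control[n_pre:]) @ coef
pointwise_effect = response[n_pre:] - counterfactual
estimated_lift = pointwise_effect.mean()

print(f"estimated average effect: {estimated_lift:.2f}")
```

The whole estimate rests on the assumption that the pre-period relationship between control and response would have persisted; the real package additionally models trend and seasonality and reports posterior uncertainty.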

More generally, we might be concerned with continuous time.

## Drawing graphical models

## Tools

Many. See, e.g., CausalDiscoveryToolbox and ijmbarr/causalgraphicalmodels (Causal Graphical Models in Python). dagR does R. DoWhy is a recent Python toolbox backed by Microsoft.

## References

*Statistical Methods in Medical Research*25 (5): 2294–314.

*PMLR*.

*Physical Review X*7 (3): 031021.

*arXiv:1703.04025 [Cs, Stat]*, March.

*Proceedings of the National Academy of Sciences*106 (51): 21544–49.

*arXiv:1907.02893 [Cs, Stat]*, March.

*Conditional Specification of Statistical Models*. Springer Science & Business Media.

*arXiv:1702.02604 [Cs, Stat]*, February.

*Proceedings of the National Academy of Sciences*113 (27): 7345–52.

*AAAI*, 2410–16.

*arXiv:1812.03253 [Cs, Stat]*.

*Uncertainty in Artificial Intelligence*, 585–94. PMLR.

*arXiv:1507.03652 [Math, Stat]*, July.

*Proceedings of the 27th ACM International Conference on Information and Knowledge Management*, 1003–12. CIKM ’18. New York, NY, USA: ACM.

*arXiv:1611.06221 [Cs, Stat]*, October.

*arXiv:1803.08784 [Cs, Stat]*, March.

*arXiv:1611.06221 [Cs, Stat]*, November.

*arXiv:1209.2355 [Cs, Math, Stat]*, July.

*Scientific Reports*6 (1): 27538.

*The Annals of Applied Statistics*9 (1): 247–74.

*Mathematical Methods of Operations Research*77 (3): 357–70.

*Statistical Science*35 (3): 404–26.

*Annual Review of Statistics and Its Application*1 (1): 255–78.

*Statistical Methods in Medical Research*22 (5): 466–92.

*Neural Computation*24 (7): 1611–68.

*Physical Review Letters*120 (19): 190401.

*arXiv:1411.1557 [Stat]*, November.

*The Annals of Statistics*40 (1): 294–321.

*Journal of Causal Inference*9 (1): 39–77.

*Biometrika*, October, asr041.

*NIPS Causality: Objectives and Assessment*, 177–90.

*Granger-Causality Graphs for Multivariate Time Series*.

*Handbook of Causal Analysis for Social Research*, edited by Stephen L. Morgan, 245–73. Handbooks of Sociology and Social Research. Dordrecht: Springer Netherlands.

*Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics*, 256–64.

*arXiv:1405.1868 [Stat]*, May.

*arXiv:2104.04103 [Cs, Stat]*, September.

*Games for the superintelligent*. London: Muller.

*The Astrophysical Journal*806 (2): 215.

*Journal of the American Statistical Association*108 (501): 288–300.

*Synthese*197 (4): 1467–86.

*American Journal of Sociology*117 (3): 955–66.

*Applied Bayesian Modeling and Causal Inference From Incomplete-Data Perspectives*. John Wiley & Sons.

*arXiv:2010.12237 [Cs]*, October.

*Annual Review of Statistics and Its Application*6 (1): 103–24.

*arXiv:1403.2310 [Stat]*, March.

*Electronic Journal of Probability*19.

*Annals of Epidemiology*26 (10): 674–80.

*American Journal of Public Health*108 (5): 616–19.

*Causal Inference: What If*.

*Causal Inference Vol 3*.

*Causal Inference Vol 2*.

*Causal Inference Vol 1*.

*Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics*, 128–35. Citeseer.

*Advances in Neural Information Processing Systems 21*, edited by D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, 689–96. Curran Associates, Inc.

*arXiv:2012.08154 [Cs, Stat]*, December.

*The Annals of Statistics*49 (3): 1460–88.

*Artificial Intelligence*182-183 (May): 1–31.

*IEEE Transactions on Information Theory*56 (10): 5168–94.

*arXiv:1605.03661 [Cs, Stat]*, June.

*International Conference on Machine Learning*, 3020–29. PMLR.

*Learning in Graphical Models*. Cambridge, Mass.: MIT Press.

*The Handbook of Brain Theory and Neural Networks*, 490–96.

*Handbook of Neural Networks and Brain Theory*.

*Journal of Machine Learning Research*21 (37): 1–5.

*Journal of Machine Learning Research*8 (May): 613–36.

*Journal of Machine Learning Research*21 (62): 1–54.

*arXiv Preprint arXiv:1510.04740*.

*Advances in Neural Information Processing Systems 30*, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 656–66. Curran Associates, Inc.

*IJCAI*, 83:190–93. San Francisco, CA, USA: Citeseer.

*Causality, Probability, and Time*. 1 edition. Cambridge: Cambridge University Press.

*Why: A Guide to Finding and Using Causes*. 1st edition. Beijing ; Boston: O’Reilly Media.

*arXiv:1709.02023 [Cs, Math, Stat]*, September.

*Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing*. 1st edition. Cambridge, United Kingdom ; New York, NY: Cambridge University Press.

*Annual Review of Statistics and Its Application*6 (1): 149–72.

*Probabilistic Graphical Models : Principles and Techniques*. Cambridge, MA: MIT Press.

*Proceedings of the National Academy of Sciences*116 (10): 4156–65.

*Journal of the Royal Statistical Society. Series B (Methodological)*50 (2): 157–224.

*Graphical Models*. Oxford Statistical Science Series. Clarendon Press.

*Complex Stochastic Systems*, 63–107. CRC Press.

*arXiv:1811.12359 [Cs, Stat]*, June.

*arXiv:1605.08179 [Cs, Stat]*, May.

*Advances in Neural Information Processing Systems 30*, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 6446–56. Curran Associates, Inc.

*arXiv Preprint arXiv:1307.5636*.

*Nature Methods*7 (4): 247–48.

*The Annals of Statistics*37 (6A): 3133–64.

*arXiv:1903.03662 [Stat]*, March.

*Proceedings of the National Academy of Sciences*107 (14): 6286–91.

*2018 IEEE Data Science Workshop (DSW)*, 6–10.

*New England Journal of Medicine*367 (16): 1562–64.

*Proceedings of the 24th International Conference on Machine Learning*, 625–32. ACM.

*Uai2018*, 17.

*Journal of Machine Learning Research*17 (32): 1–102.

*Counterfactuals and Causal Inference*. Cambridge University Press.

*Machine learning: a probabilistic perspective*. 1 edition. Adaptive computation and machine learning series. Cambridge, MA: MIT Press.

*arXiv:1911.06030 [Stat]*, November.

*Learning Bayesian Networks*. Vol. 38. Prentice Hall, Paperback.

*arXiv:1910.08527 [Cs, Stat]*, February.

*Advances In Neural Information Processing Systems*.

*Social Networks*33 (3): 211–18.

*arXiv:2110.10819 [Cs]*, October.

*In Proceedings of the National Conference on Artificial Intelligence*, 133–36.

*Artificial Intelligence*29 (3): 241–88.

*Quantified Representation of Uncertainty and Imprecision*, edited by Philippe Smets, 367–89. Handbook of Defeasible Reasoning and Uncertainty Management Systems. Dordrecht: Springer Netherlands.

*Probabilistic reasoning in intelligent systems: networks of plausible inference*. Rev. 2. print., 12. [Dr.]. The Morgan Kaufmann series in representation and reasoning. San Francisco, Calif: Kaufmann.

*Statistics Surveys*3: 96–146.

*Causality: Models, Reasoning and Inference*. Cambridge University Press.

*Sociological Methodology*40 (1): 75–149.

*Statistical Science*29 (4): 579–95.

*Causal Inference in Statistics: A Primer*. Wiley.

*arXiv:1501.01332 [Stat]*, January.

*Elements of Causal Inference: Foundations and Learning Algorithms*. Adaptive Computation and Machine Learning Series. Cambridge, Massachusetts: The MIT Press.

*The Journal of Machine Learning Research*15 (1): 2009–53.

*2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton)*, 958–65.

*Proceedings of the 27th ACM International Conference on Information and Knowledge Management*, 1679–82. CIKM ’18. New York, NY, USA: Association for Computing Machinery.

*Current Epidemiology Reports*3 (1): 63–71.

*Annals of Statistics*30 (4): 962–1030.

*Latent Variable Modeling and Applications to Causality*, edited by Maia Berkane, 69–117. Lecture Notes in Statistics. New York, NY: Springer.

*Advances in Methods and Practices in Psychological Science*, January.

*arXiv:1801.06229 [Stat]*, May.

*Journal of Machine Learning Research*21 (188): 1–86.

*Uncertainty in Artificial Intelligence*.

*arXiv:1707.00819 [Cs, Stat]*, July.

*Statistical Science*21 (2): 206–22.

*Use of Directed Acyclic Graphs*. Agency for Healthcare Research and Quality (US).

*arXiv:1911.10500 [Cs, Stat]*, December.

*ICML 2012*.

*arXiv:1505.03036 [Astro-Ph, Stat]*, May.

*Proceedings of the IEEE*109 (5): 612–34.

*arXiv:1501.06794 [Cs, Stat]*, January.

*Proceedings of the 31st International Conference on Neural Information Processing Systems*, 1696–706. NIPS’17. Red Hook, NY, USA: Curran Associates Inc.

*arXiv:1607.06565 [Physics, Stat]*, July.

*Sociological Methods & Research*40 (2): 211–39.

*Cause and Correlation in Biology: A User’s Guide to Path Analysis, Structural Equations and Causal Inference with R*. 2nd ed. Cambridge: Cambridge University Press.

*The Journal of Machine Learning Research*9: 1941–79.

*arXiv:1411.2127 [Stat]*, November.

*BMC Medical Research Methodology*8 (1): 70.

*arXiv:2007.04511 [Stat]*, July.

*Proceedings of the Conference on Empirical Methods in Natural Language Processing*, 145–56. Association for Computational Linguistics.

*Causation, Prediction, and Search*. Second Edition. Adaptive Computation and Machine Learning. The MIT Press.

*The 22nd International Conference on Artificial Intelligence and Statistics*, 3118–27. PMLR.

*arXiv:1508.00280 [Cs]*, August.

*Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence*, 681–88. UAI’11. Arlington, Virginia, USA: AUAI Press.

*arXiv:1710.05899 [Cs]*, July.

*arXiv:1807.04010 [Cs, Stat]*, July.

*Statistical Methods in Medical Research*21 (1): 7–30.

*arXiv:1407.2483 [Cs, Stat]*, July.

*Evolution*68 (7): 2128–36.

*arXiv:1710.02428 [Astro-Ph]*, October.

*arXiv:2002.06060 [q-Bio, Stat]*, July.

*arXiv:2007.10979 [Cs, Stat]*, July.

*The Annals of Mathematical Statistics*5 (3): 161–215.

*arXiv:1611.04660 [Cs, Stat]*, November.

*arXiv:2004.08697 [Cs, Stat]*, July.

*Exploring Artificial Intelligence in the New Millennium*, edited by G. Lakemeyer and B. Nebel, 239–36. Morgan Kaufmann Publishers.

*Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence*, 3315–21. AAAI’16. Phoenix, Arizona: AAAI Press.

*Proceedings of the UAI 2014 Conference on Causal Inference: Learning and Prediction - Volume 1274*, 11–24. CI’14. Aachen, DEU: CEUR-WS.org.

*Proceedings of the 24th International Conference on Artificial Intelligence*, 3243–49. IJCAI’15. Buenos Aires, Argentina: AAAI Press.

*arXiv:1202.3775 [Cs, Stat]*, February.
