Applying a causal graph structure in the challenging environment of a no-holds-barred nonparametric machine learning algorithm such as a neural net or its ilk. I am interested in this because it seems necessary and kind of obvious for handling things like dataset shift, but it is often ignored. What is that about?

I do not know at the moment. This is a link salad for now.

Léon Bottou, From Causal Graphs to Causal Invariance:

For many problems, it’s difficult to even attempt drawing a causal graph. While structural causal models provide a complete framework for causal inference, it is often hard to encode known physical laws (such as Newton’s gravitation, or the ideal gas law) as causal graphs. In familiar machine learning territory, how does one model the causal relationships between individual pixels and a target prediction? This is one of the motivating questions behind the paper Invariant Risk Minimization (IRM). In place of structured graphs, the authors elevate invariance to the defining feature of causality.
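A minimal sketch of what "invariance" can mean in practice, loosely in the spirit of the IRMv1 penalty from the IRM paper: for each environment, take the gradient of the risk with respect to a dummy scale `w` fixed at 1, and square it. A predictor that is simultaneously optimal in every environment should have a penalty near zero. The toy data, the two hand-picked predictors, and the names `make_env`, `irm_penalty` and `total_penalty` are all assumptions made for illustration; none of this is the authors' code.

```python
import random

def irm_penalty(features, targets):
    """Squared gradient of mean((w * f - y)^2) with respect to w, evaluated at w = 1."""
    n = len(targets)
    grad = sum(2.0 * (f - y) * f for f, y in zip(features, targets)) / n
    return grad ** 2

def make_env(spurious_scale, n=2000, seed=0):
    """Toy environment (hypothetical): x_causal -> y -> x_spurious.

    The link from y to x_spurious changes with the environment;
    the link from x_causal to y does not.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x_causal = rng.gauss(0.0, 1.0)
        y = x_causal + rng.gauss(0.0, 0.5)
        x_spurious = spurious_scale * y + rng.gauss(0.0, 0.5)
        rows.append((x_causal, x_spurious, y))
    return rows

# Two environments that differ only in how y drives the spurious feature.
envs = [make_env(2.0, seed=1), make_env(0.5, seed=2)]

def total_penalty(predict):
    """Sum the per-environment penalties for a fixed predictor."""
    total = 0.0
    for env in envs:
        features = [predict(xc, xs) for xc, xs, _ in env]
        targets = [y for _, _, y in env]
        total += irm_penalty(features, targets)
    return total

# Predictor built on the causal feature versus one built on the spurious feature.
penalty_causal = total_penalty(lambda xc, xs: xc)
penalty_spurious = total_penalty(lambda xc, xs: 0.5 * xs)

print("causal predictor penalty:  ", penalty_causal)
print("spurious predictor penalty:", penalty_spurious)
```

In the full method this penalty is added to the pooled training loss of a learned representation; here the predictors are fixed by hand so that only the penalty itself is on display.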

Causality for Machine Learning, by Nisha Muktewar and Chris Wallace, is the book Bottou recommends on this theme.

For coders, Ben Dickson writes on Why machine learning struggles with causality.

Künzel et al. (2019) (HT Mike McKenna) looks interesting - it is a generic intervention estimator for ML methods.

… We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms, such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks, to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms.
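A rough sketch of the X-learner recipe as I read it from the abstract and paper: fit a response model per treatment arm, impute individual treatment effects by crossing each arm's outcomes with the other arm's model, regress those imputed effects on the covariates, then blend the two CATE estimates with a weight function. Everything specific here is an assumption for illustration: the base learner is a one-dimensional least-squares line (`fit_line`) rather than an RF or BART, the weight is a constant treated share rather than an estimated propensity score, and the data are synthetic.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns the fitted function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def x_learner(x_control, y_control, x_treated, y_treated, propensity):
    """X-learner sketch with `fit_line` as the (hypothetical) base learner."""
    # Stage 1: one response model per treatment arm.
    mu0 = fit_line(x_control, y_control)
    mu1 = fit_line(x_treated, y_treated)
    # Stage 2: imputed individual effects, each arm using the other arm's model.
    d_treated = [y - mu0(x) for x, y in zip(x_treated, y_treated)]
    d_control = [mu1(x) - y for x, y in zip(x_control, y_control)]
    tau1 = fit_line(x_treated, d_treated)
    tau0 = fit_line(x_control, d_control)
    # Stage 3: weighted combination of the two CATE estimates.
    return lambda x: propensity(x) * tau0(x) + (1.0 - propensity(x)) * tau1(x)

# Synthetic, unbalanced data: 40 control units, 5 treated units.
rng = random.Random(0)
true_cate = lambda x: 0.5 + x
x_control = [i / 10 for i in range(40)]
y_control = [1.0 + 2.0 * x + rng.gauss(0.0, 0.1) for x in x_control]
x_treated = [0.0, 1.0, 2.0, 3.0, 4.0]
y_treated = [1.0 + 2.0 * x + true_cate(x) + rng.gauss(0.0, 0.1) for x in x_treated]

# Constant weight: the share of treated units (a simplifying assumption).
share_treated = len(x_treated) / (len(x_treated) + len(x_control))
cate = x_learner(x_control, y_control, x_treated, y_treated,
                 propensity=lambda x: share_treated)

print("estimated CATE at x = 2.0:", cate(2.0))
print("true CATE at x = 2.0:     ", true_cate(2.0))
```

With few treated units the weight leans on `tau1`, whose imputed effects rely on the control-arm response model fitted from the larger group, which seems to be the intuition behind the abstract's claim about unbalanced designs.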

There is a fun body of work by what is, in my mind, the Central European causality-ML thinktank, which includes various interesting people: Bernhard Schölkopf, Jonas Peters, Joris Mooij, Stephan Bongers, Dominik Janzing, etc. I would love to understand everything that is going on here. Perhaps I should start with the book (Peters, Janzing, and Schölkopf 2017) (Free PDF), or the chatty casual introduction (Schölkopf 2019).

For a good explanation of what they are about by example, see Bernhard Schölkopf: Causality and Exoplanets.

I am particularly curious about their work in causality in continuous fields, e.g. Bongers et al. (2020); Bongers and Mooij (2018); Bongers et al. (2016); Rubenstein et al. (2018).

## References

*arXiv:1907.02893 [cs, Stat]*, March. http://arxiv.org/abs/1907.02893.

*arXiv:1812.03253 [cs, Stat]*. http://arxiv.org/abs/1812.03253.

*arXiv:1611.06221 [cs, Stat]*, October. http://arxiv.org/abs/1611.06221.

*arXiv:1803.08784 [cs, Stat]*, March. http://arxiv.org/abs/1803.08784.

*arXiv:1611.06221 [cs, Stat]*, November. http://arxiv.org/abs/1611.06221.

*arXiv:2009.09070 [cs]*, September. http://arxiv.org/abs/2009.09070.

*arXiv:1909.10893 [cs, Stat]*, November. http://arxiv.org/abs/1909.10893.

*Advances in Neural Information Processing Systems 29*, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2946–54. Curran Associates, Inc. http://papers.nips.cc/paper/6379-composing-graphical-models-with-neural-networks-for-structured-representations-and-fast-inference.pdf.

*arXiv:1709.02023 [cs, Math, Stat]*, September. http://arxiv.org/abs/1709.02023.

*Proceedings of the National Academy of Sciences* 116 (10): 4156–65. https://doi.org/10.1073/pnas.1804597116.

*arXiv:2006.07796 [cs, Stat]*, July. http://arxiv.org/abs/2006.07796.

*arXiv:1811.12359 [cs, Stat]*, June. http://arxiv.org/abs/1811.12359.

*Advances in Neural Information Processing Systems 30*, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 6446–56. Curran Associates, Inc. http://papers.nips.cc/paper/7223-causal-effect-inference-with-deep-latent-variable-models.pdf.

*arXiv:2102.12353 [cs, Stat]*, June. http://arxiv.org/abs/2102.12353.

*arXiv:1412.3773 [cs, Stat]*, December. http://arxiv.org/abs/1412.3773.

*arXiv:1910.08527 [cs, Stat]*, February. http://arxiv.org/abs/1910.08527.

*Advances In Neural Information Processing Systems*. http://arxiv.org/abs/1911.07420.

*Elements of Causal Inference: Foundations and Learning Algorithms*. Adaptive Computation and Machine Learning Series. Cambridge, Massachusetts: The MIT Press. https://www.dropbox.com/s/dl/gkmsow492w3oolt/11283.pdf.

*Proceedings of the 27th ACM International Conference on Information and Knowledge Management*, 1679–82. CIKM ’18. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3269206.3269267.

*Journal of Machine Learning Research* 21 (188): 1–86. http://jmlr.org/papers/v21/19-1026.html.

*Uncertainty in Artificial Intelligence*. http://arxiv.org/abs/1608.08028.

*arXiv:1911.10500 [cs, Stat]*, December. http://arxiv.org/abs/1911.10500.

*Proceedings of the IEEE* 109 (5): 612–34. https://doi.org/10.1109/JPROC.2021.3058954.

*arXiv:2109.03795 [cs, Stat]*, September. http://arxiv.org/abs/2109.03795.

*arXiv:2004.08697 [cs, Stat]*, July. http://arxiv.org/abs/2004.08697.

*Advances in Neural Information Processing Systems*. Vol. 33. https://arxiv.org/abs/2002.03278.

*arXiv:2010.07684 [cs]*, February. http://arxiv.org/abs/2010.07684.
