Doubly robust learning for causal inference
TMLE, debiased ML, X-learners, Neyman orthogonality, targeted learning
September 18, 2020 — May 27, 2024
An area of causal learning, in particular ML-style causal learning, that I should learn about. It looks a lot like instrumental-variables regression, except that the latter is usually presented in a strictly linear context.
In the meantime, there are many papers I have been recommended. Probably I should start from recent reviews such as Guo et al. (2020), Kennedy (2023), Funk et al. (2011) or Chernozhukov et al. (2017).
I was introduced to this area from Künzel et al. (2019) by Mike McKenna. That paper introduces a generic intervention estimator for ML methods.
We describe a number of meta-algorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Meta-algorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a meta-algorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the meta-learners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms.
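To fix ideas, here is a minimal sketch of the X-learner recipe from that abstract, using plain OLS base learners on synthetic data. Everything here (the data-generating process, the OLS helpers, the constant propensity estimate) is illustrative scaffolding of my own, not code from the paper; in practice the base learners would be RFs, BART, or similar.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_fit(X, y):
    # Ordinary least squares with an intercept column; returns coefficients.
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def ols_predict(beta, X):
    Xb = np.column_stack([np.ones(len(X)), X])
    return Xb @ beta

# Synthetic data: Y = x + T * (2x) + noise, so the true CATE is tau(x) = 2x.
n = 4000
X = rng.uniform(-1, 1, size=(n, 1))
T = rng.binomial(1, 0.8, size=n)        # deliberately imbalanced arms, the X-learner's niche
Y = X[:, 0] + T * (2 * X[:, 0]) + 0.1 * rng.normal(size=n)

X0, Y0 = X[T == 0], Y[T == 0]
X1, Y1 = X[T == 1], Y[T == 1]

# Stage 1: fit an outcome model per arm.
mu0 = ols_fit(X0, Y0)
mu1 = ols_fit(X1, Y1)

# Stage 2: impute individual treatment effects, then regress them on X.
D1 = Y1 - ols_predict(mu0, X1)          # treated: observed minus predicted control outcome
D0 = ols_predict(mu1, X0) - Y0          # control: predicted treated outcome minus observed
tau1 = ols_fit(X1, D1)
tau0 = ols_fit(X0, D0)

# Stage 3: blend the two CATE estimates with a weight g(x);
# here g is just the overall treated fraction, a crude propensity estimate.
g = T.mean()
def cate(Xnew):
    return g * ols_predict(tau0, Xnew) + (1 - g) * ols_predict(tau1, Xnew)

print(cate(np.array([[0.5], [-0.5]])))  # true CATE values are 1.0 and -1.0
```

The point of stage 2 is that each arm's scarce data is only asked to fit an *effect* function, borrowing the other arm's outcome model, which is why the imbalanced-arms case works out.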
See also Mishler and Kennedy (2021). Maybe related: Shalit, Johansson, and Sontag (2017), Shi, Blei, and Veitch (2019).
1 Tooling
1.1 “Generalized” random forests
The generalized random forests package (Athey, Tibshirani, and Wager 2019) (implementation) describes itself thus:
GRF extends the idea of a classic random forest to allow for estimating other statistical quantities besides the expected outcome. Each forest type, for example quantile_forest, trains a random forest targeted at a particular problem, like quantile estimation. The most common use of GRF is in estimating treatment effects through the function causal_forest.
1.2 EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) \(t\) on an outcome variable \(y\), controlling for a set of features \(x\).
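The "double/debiased ML" idea that econml packages up can be sketched from scratch in a few lines: partial the confounders out of both treatment and outcome with flexible nuisance models (here, toy OLS ones), using cross-fitting so each residual is predicted out-of-fold, then regress residual on residual. This is my own numpy sketch of the Chernozhukov et al. partialling-out estimator, not econml's API; the data-generating process and helper names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: constant treatment effect theta = 2,
# with a confounder X driving both the treatment T and the outcome Y.
n = 4000
X = rng.normal(size=(n, 1))
T = 0.5 * X[:, 0] + rng.normal(size=n)
Y = 2.0 * T + X[:, 0] + rng.normal(size=n)

def ols_fit_predict(Xtr, ytr, Xte):
    # Fit OLS (with intercept) on the training fold, predict on the test fold.
    Xb = np.column_stack([np.ones(len(Xtr)), Xtr])
    beta = np.linalg.lstsq(Xb, ytr, rcond=None)[0]
    return np.column_stack([np.ones(len(Xte)), Xte]) @ beta

# Cross-fitting with two folds: nuisance predictions are always out-of-fold.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2 :]]
Yres = np.empty(n)
Tres = np.empty(n)
for k in (0, 1):
    tr, te = folds[k], folds[1 - k]
    Yres[te] = Y[te] - ols_fit_predict(X[tr], Y[tr], X[te])  # partial X out of Y
    Tres[te] = T[te] - ols_fit_predict(X[tr], T[tr], X[te])  # partial X out of T

# Final, Neyman-orthogonal stage: regress outcome residuals on treatment residuals.
theta_hat = (Tres @ Yres) / (Tres @ Tres)
print(theta_hat)  # should land near the true effect of 2.0
```

The orthogonality of the final stage is what makes the estimate of theta first-order insensitive to errors in the nuisance models, which is why slow-converging ML fits can be plugged in for them.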