Doubly robust learning for causal inference
TMLE, debiased ML, X-learners, Neyman learning, targeted learning
September 18, 2020 — May 27, 2024
An area of causal inference, in particular ML-style causal learning, that I should learn about. It looks a lot like instrumental variables regression, except that the latter is usually presented in a strictly linear context.
I was introduced to this area by Künzel et al. (2019) (thanks to Mike McKenna). That paper introduces a generic intervention estimator for ML methods.
We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favourably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms.
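The X-learner described in that abstract has a simple three-stage recipe: fit separate outcome models per arm, impute individual-level treatment effects by crossing the arms, then regress those imputed effects on covariates. A minimal sketch with random-forest base learners (simulated data and a known 0.5 propensity score are my own assumptions for illustration, not from the paper):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
T = rng.binomial(1, 0.5, size=n)        # randomized treatment
tau = 1.0 + X[:, 0]                     # true CATE, linear in the first covariate
Y = X.sum(axis=1) + T * tau + rng.normal(scale=0.5, size=n)

# Stage 1: outcome models fit separately on control and treated units
mu0 = RandomForestRegressor(random_state=0).fit(X[T == 0], Y[T == 0])
mu1 = RandomForestRegressor(random_state=0).fit(X[T == 1], Y[T == 1])

# Stage 2: imputed individual effects, crossing the arms
d1 = Y[T == 1] - mu0.predict(X[T == 1])   # treated outcome minus predicted control
d0 = mu1.predict(X[T == 0]) - Y[T == 0]   # predicted treated minus control outcome
tau1 = RandomForestRegressor(random_state=0).fit(X[T == 1], d1)
tau0 = RandomForestRegressor(random_state=0).fit(X[T == 0], d0)

# Stage 3: combine, weighting by the propensity score (known here: 0.5)
g = 0.5
cate = g * tau0.predict(X) + (1 - g) * tau1.predict(X)
```

The propensity weighting in stage 3 is what lets the X-learner exploit unbalanced designs: when one arm is small, its noisy effect estimate gets down-weighted.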
Since then, many papers have been recommended to me. Probably I should start from recent reviews such as Guo et al. (2020), Kennedy (2023), Funk et al. (2011) or Chernozhukov et al. (2017).
See also Mishler and Kennedy (2021). Maybe related: Shalit, Johansson, and Sontag (2017), Shi, Blei, and Veitch (2019).
1 Tooling
1.1 “Generalized” random forests
The generalized random forests package (Athey, Tibshirani, and Wager 2019) (implementation) describes itself:
GRF extends the idea of a classic random forest to allow for estimating other statistical quantities besides the expected outcome. Each forest type, for example `quantile_forest`, trains a random forest targeted at a particular problem, like quantile estimation. The most common use of GRF is in estimating treatment effects through the function `causal_forest`.
1.2 EconML
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) \(t\) on an outcome variable \(y\), controlling for a set of features \(x\).
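The orthogonal / double machine learning idea underneath econml can be sketched without the library itself: residualize both outcome and treatment on the confounders with cross-fitted ML models, then regress residuals on residuals. A hand-rolled version of the partialling-out estimator (simulated confounded data is my own assumption; this is a sketch of the idea, not econml's actual implementation):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))
T = X[:, 0] + rng.normal(size=n)   # treatment confounded by X
theta = 2.0                        # true constant treatment effect
Y = theta * T + X[:, 0] + rng.normal(size=n)

# Cross-fitted nuisance predictions: each fold is predicted by a model
# trained on the other fold, which is what keeps the estimator orthogonal
y_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, Y, cv=2)
t_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, T, cv=2)

# Regress outcome residuals on treatment residuals
res_y, res_t = Y - y_hat, T - t_hat
theta_hat = LinearRegression(fit_intercept=False).fit(
    res_t.reshape(-1, 1), res_y).coef_[0]
```

The Neyman orthogonality of the residual-on-residual moment is what makes the final estimate first-order insensitive to errors in the two random-forest nuisance fits.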