External validity

Transfer learning, dataset shift, learning under covariate shift, transferable learning, domain adaptation, etc.


This Māori gentleman from the 1800s demonstrates artful transfer learning from the Western fashion domain

One could read Sebastian Ruder’s NN-style introduction to “transfer learning”. NN people like to think about this in a particular way, which I like because of the diversity of out-of-the-box ideas it invites and dislike because it is sloppy. The central idea here is learning well-factored causal graphical models; everything else is just an approximation to that. The reason this is a hot topic in neural nets, I suspect, is that it is convenient for massive, low-human-effort neural networks to ignore graphical structure to get good results from regressions on observational data. Recovering the fancy performance in a black-box model is even more tedious than in a classical one. Also, it fits the social conventions of neural network research to reinvent methods to fix such problems without reference to previous conventions, for better and worse.

One thing that the machine learning setup gives us is an additional emphasis. External validity, the traditional framing, would ask whether the model you have learnt is still useful on new data. The transfer learning setup invites us to consider whether we can transfer some of the computational effort from learning on one data set to learning on a new data set, and if so, how much.
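As a toy illustration of transferring computational effort, here is a minimal sketch that fits a one-dimensional linear regression by gradient descent on a "source" data set, then fits a slightly shifted "target" data set twice: once from scratch and once warm-started from the source parameters. Everything here (the data-generating slopes and intercepts, the helper names `make_data` and `fit`, the step size and tolerance) is made up for illustration; warm-starting is only one crude stand-in for the much richer notion of transfer discussed above.

```python
import random


def make_data(slope, intercept, n, rng):
    """Sample n points from y = slope * x + intercept + Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit(xs, ys, init=(0.0, 0.0), lr=0.1, tol=1e-6, max_steps=10_000):
    """Minimise mean squared error by gradient descent.

    Returns the fitted (slope, intercept) and the number of steps taken
    before the gradient norm dropped below tol.
    """
    a, b = init
    n = len(xs)
    for step in range(max_steps):
        grad_a = 2.0 * sum((a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = 2.0 * sum((a * x + b - y) for x, y in zip(xs, ys)) / n
        if (grad_a ** 2 + grad_b ** 2) ** 0.5 < tol:
            return (a, b), step
        a -= lr * grad_a
        b -= lr * grad_b
    return (a, b), max_steps


rng = random.Random(0)
source = make_data(2.0, 1.0, 200, rng)  # hypothetical source domain
target = make_data(2.2, 0.9, 200, rng)  # hypothetical, slightly shifted target

source_params, _ = fit(*source)
cold_params, cold_steps = fit(*target)                      # from scratch
warm_params, warm_steps = fit(*target, init=source_params)  # reuse source fit

print("cold start steps:", cold_steps)
print("warm start steps:", warm_steps)
```

Both runs end up at essentially the same target fit; the difference in step counts is a rough measure of how much effort was transferred in this particular toy problem.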

This also connects to semi-supervised learning and fairness, as argued in (Schölkopf et al. 2012; Schölkopf 2019).

Standard graphical models

We can just try some basic graphical model technology and see how far we get. If the right independences are enforced, presumably we are doing something not too far from learning a transferable model? Or, if we work out that the necessary parameters are not identifiable, then we discover that we cannot in fact learn a transferable model, right? (But maybe we can learn a somewhat transferable model?) I guess the key weakness here is that graphical models will miss some types of transferability, specifically independences that hold only for particular values of the nodes, so this approach might be less powerful.
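To make the last point concrete, here is a small sketch of a value-dependent (often called context-specific) independence. The conditional probability table is invented for illustration: Y ignores X entirely when Z = 0 but depends strongly on X when Z = 1. A plain directed graph can only record that X and Z are both parents of Y, so the independence that holds in the Z = 0 context is invisible at the graph level.

```python
# Hypothetical conditional probability table P(Y = 1 | X = x, Z = z),
# keyed by (x, z). The numbers are made up for illustration.
p_y1 = {
    (0, 0): 0.5, (1, 0): 0.5,  # context Z = 0: Y does not depend on X
    (0, 1): 0.1, (1, 1): 0.9,  # context Z = 1: Y depends strongly on X
}


def effect_of_x(z):
    """Change in P(Y = 1) when X goes from 0 to 1, within the context Z = z."""
    return p_y1[(1, z)] - p_y1[(0, z)]


def graph_parents_of_y():
    """Parents a plain DAG must give Y: any variable that matters in some context."""
    parents = set()
    if any(p_y1[(0, z)] != p_y1[(1, z)] for z in (0, 1)):
        parents.add("X")
    if any(p_y1[(x, 0)] != p_y1[(x, 1)] for x in (0, 1)):
        parents.add("Z")
    return parents


print("parents of Y in the graph:", sorted(graph_parents_of_y()))
print("effect of X on Y when Z = 0:", effect_of_x(0))
print("effect of X on Y when Z = 1:", effect_of_x(1))
```

In this toy example, a model of Y learnt in a population where Z = 0 would say nothing about X, and that fact would be lost if we only looked at the graph, which lists X as a parent of Y regardless of context.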



salad is a library to easily set up experiments using the current state-of-the-art techniques in domain adaptation. It features several recent approaches, with the goal of being able to run fair comparisons between algorithms and transfer them to real-world use cases.


Bongers, Stephan, Patrick Forré, Jonas Peters, Bernhard Schölkopf, and Joris M. Mooij. 2020. “Foundations of Structural Causal Models with Cycles and Latent Variables.” October 8, 2020. http://arxiv.org/abs/1611.06221.
D’Amour, Alexander, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, et al. 2020. “Underspecification Presents Challenges for Credibility in Modern Machine Learning.” November 6, 2020. http://arxiv.org/abs/2011.03395.
Deaton, Angus, and Nancy Cartwright. 2016. “Understanding and Misunderstanding Randomized Controlled Trials.” Working Paper 22595. National Bureau of Economic Research. https://doi.org/10.3386/w22595.
Kilbertus, Niki, Mateo Rojas Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Schölkopf. 2017. “Avoiding Discrimination Through Causal Reasoning.” In Advances in Neural Information Processing Systems 30, edited by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, 656–66. Curran Associates, Inc. http://papers.nips.cc/paper/6668-avoiding-discrimination-through-causal-reasoning.pdf.
Olteanu, Alexandra, Carlos Castillo, Fernando Diaz, and Emre Kıcıman. 2019. “Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries.” Frontiers in Big Data 2. https://doi.org/10.3389/fdata.2019.00013.
Pearl, Judea, and Elias Bareinboim. 2014. “External Validity: From Do-Calculus to Transportability Across Populations.” Statistical Science 29 (4): 579–95. https://doi.org/10.1214/14-STS486.
Peters, Jonas, Dominik Janzing, and Bernhard Schölkopf. 2017. Elements of Causal Inference: Foundations and Learning Algorithms. Adaptive Computation and Machine Learning Series. Cambridge, Massachusetts: The MIT Press. https://www.dropbox.com/s/dl/gkmsow492w3oolt/11283.pdf.
Quiñonero-Candela, Joaquin. 2009. Dataset Shift in Machine Learning. Cambridge, Mass.: MIT Press. http://ndl.ethernet.edu.et/handle/123456789/47647.
Schölkopf, Bernhard. 2019. “Causality for Machine Learning.” December 23, 2019. http://arxiv.org/abs/1911.10500.
Schölkopf, Bernhard, Dominik Janzing, Jonas Peters, Eleni Sgouritsa, Kun Zhang, and Joris Mooij. 2012. “On Causal and Anticausal Learning.” In ICML 2012. http://arxiv.org/abs/1206.6471.
Schölkopf, Bernhard, David W. Hogg, Dun Wang, Daniel Foreman-Mackey, Dominik Janzing, Carl-Johann Simon-Gabriel, and Jonas Peters. 2015. “Removing Systematic Errors for Exoplanet Search via Latent Causes.” May 12, 2015. http://arxiv.org/abs/1505.03036.
Schram, Arthur. 2005. “Artificiality: The Tension Between Internal and External Validity in Economic Experiments.” Journal of Economic Methodology 12 (2): 225–37. https://doi.org/10.1080/13501780500086081.
Subbaswamy, Adarsh, Peter Schulam, and Suchi Saria. 2019. “Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport.” In The 22nd International Conference on Artificial Intelligence and Statistics, 3118–27. PMLR. http://proceedings.mlr.press/v89/subbaswamy19a.html.
Verma, Sahil, John Dickerson, and Keegan Hines. 2020. “Counterfactual Explanations for Machine Learning: A Review.” In, 22.
