Bayesian and causal inference by foundation models

2024-08-29 — 2025-02-24

A placeholder for exploring the idea that transformers or their ilk might be good at genuinely general Bayesian inference, and even causal inference, without causality being explicitly designed into their loss functions.

As set functions, transformers look a lot like ‘generalized inference machines’. Are they? Can we make them do ‘proper’ causal inference in some formal sense?

This is a scrapbook of interesting approaches: Bayesian inference over LLM outputs, understanding in-context learning as Bayesian conditioning, and so on.

Last time I checked, this phenomenon was understood mostly empirically, although there are lots of reasons we might imagine it can happen in practice.
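To make "in-context learning as Bayesian conditioning" concrete, here is a minimal sketch of the idealized predictor that view has in mind: a Bayesian mixture over a small hypothesis class, conditioned on the context. Everything in it is an assumption chosen for illustration (a grid of coin biases `thetas`, a uniform `prior`, binary tokens); it is not taken from any of the papers cited here, and it says nothing about whether a trained transformer actually implements this computation.

```python
import numpy as np

# Hypothetical finite hypothesis class: coin biases on a grid, with a uniform prior.
thetas = np.linspace(0.05, 0.95, 19)
prior = np.full(len(thetas), 1.0 / len(thetas))

def posterior(context):
    """Posterior over thetas after conditioning on a context of 0/1 tokens."""
    context = np.asarray(context)
    k, n = context.sum(), len(context)
    # Bernoulli log-likelihood of the context under each hypothesis.
    log_post = np.log(prior) + k * np.log(thetas) + (n - k) * np.log1p(-thetas)
    post = np.exp(log_post - log_post.max())  # subtract the max for numerical stability
    return post / post.sum()

def predict_next(context):
    """Posterior predictive probability that the next token is 1."""
    return float(posterior(context) @ thetas)

print(predict_next([]))               # prediction from the prior alone
print(predict_next([1, 1, 0, 1, 1]))  # prediction after conditioning on a context
```

The predictor depends on the context only through its counts, so it is invariant to reordering the context, which is the sense in which such a predictor is a set function.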

Probably connected: Mechanistic interpretability, causal inference in foundation models, explicitly Bayesian predictors and so on.

1 Causal Abstraction

Geiger et al. (2024) build upon Correa and Bareinboim (2020):

In some ways, studying modern deep learning models is like studying the weather or an economy: they involve large numbers of densely connected ‘microvariables’ with complex, non-linear dynamics. One way of reining in this complexity is to find ways of understanding these systems in terms of higher-level, more abstract variables (‘macrovariables’). For instance, the many microvariables might be clustered together into more abstract macrovariables. A number of researchers have been exploring theories of causal abstraction, providing a mathematical framework for causally analyzing a system at multiple levels of detail

Indeed they have. See Causal Abstraction.

2 Probabilistic sampling from transformers

Alireza Makhzani introduces Zhao et al. (2024):

Many capability and safety techniques of LLMs—such as RLHF, automated red-teaming, prompt engineering, and infilling—can be viewed from a probabilistic inference perspective, specifically as sampling from an unnormalised target distribution defined by a given reward or potential function. Building on this perspective, we propose using twisted Sequential Monte Carlo (SMC) as a principled probabilistic inference framework to approach these problems. Twisted SMC is a variant of SMC with additional twist functions that predict the future value of the potential at each timestep, enabling the inference to focus on promising partial sequences. We show the effectiveness of twisted SMC for sampling rare, undesirable outputs from a pretrained model (useful for harmlessness training and automated red-teaming), generating reviews with varied sentiment, and performing infilling tasks.

Our paper offers much more! We propose a novel twist learning method inspired by energy-based models; we connect the twisted SMC literature with soft RL; we propose novel bidirectional SMC bounds on log partition functions as a method for evaluating inference in LLMs; and finally we provide probabilistic perspectives for many more controlled generation methods in LLMs.
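A toy sketch may help fix ideas about sampling from an unnormalised target of the form sigma(s) ∝ p0(s) phi(s) with SMC. It is not the method of Zhao et al.: the "language model" is an i.i.d. coin over two tokens, the twist is written by hand rather than learned, and particles are proposed from the base model rather than from a twisted proposal. All names (`T`, `P_ONE`, `BETA`, `twist`, `smc`) are hypothetical. Because the toy base model is i.i.d., this simple twist happens to be proportional to the optimal one; for a real language model it would have to be learned.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
T = 6          # sequence length (assumed)
P_ONE = 0.3    # toy base model: i.i.d. tokens with P(token = 1) = 0.3
BETA = 1.5     # reward strength in the potential (assumed)

def twist(prefixes):
    """Hand-written twist psi_t = exp(BETA * number of ones); at t = T it is the potential phi."""
    return np.exp(BETA * prefixes.sum(axis=1))

def smc(n_particles):
    """Sample approximately from sigma(s) proportional to p0(s) * phi(s); also estimate log Z."""
    particles = np.zeros((n_particles, 0), dtype=int)
    prev_twist = np.ones(n_particles)  # psi_0 = 1 for the empty prefix
    log_z = 0.0
    for _ in range(T):
        # Propose the next token of every particle from the base model.
        new_tokens = (rng.random(n_particles) < P_ONE).astype(int)
        particles = np.concatenate([particles, new_tokens[:, None]], axis=1)
        # Incremental weight: ratio of successive twist values.
        cur_twist = twist(particles)
        weights = cur_twist / prev_twist
        log_z += np.log(weights.mean())
        # Multinomial resampling concentrates particles on promising prefixes.
        idx = rng.choice(n_particles, size=n_particles, p=weights / weights.sum())
        particles, prev_twist = particles[idx], cur_twist[idx]
    return particles, log_z

# Exact log Z by brute-force enumeration, feasible only because the toy is tiny.
all_seqs = np.array(list(itertools.product([0, 1], repeat=T)))
base_prob = np.prod(np.where(all_seqs == 1, P_ONE, 1 - P_ONE), axis=1)
log_z_exact = np.log(np.sum(base_prob * twist(all_seqs)))

particles, log_z_hat = smc(5000)
print(log_z_hat, log_z_exact)
```

The resampled particles should contain far more ones than samples from the base model would, and the SMC estimate of the log partition function can be checked against the enumerated value.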

More methods in the references.

3 Designing foundation models explicitly to do Bayes-in-context

See Predictive Bayes NNs.

4 In AI Safety

5 Incoming

Transformers are Graph Neural Networks
Pascal Hirsch mentions Reuter et al. (2025):

Transformers have emerged as the dominant architecture in the field of deep learning, with a broad range of applications and remarkable in-context learning (ICL) capabilities. While not yet fully understood, ICL has already proved to be an intriguing phenomenon, allowing transformers to learn in context – without requiring further training. In this paper, we further advance the understanding of ICL by demonstrating that transformers can perform full Bayesian inference for commonly used statistical models in context. More specifically, we introduce a general framework that builds on ideas from prior fitted networks and continuous normalizing flows which enables us to infer complex posterior distributions for methods such as generalized linear models and latent factor models. Extensive experiments on real-world datasets demonstrate that our ICL approach yields posterior samples that are similar in quality to state-of-the-art MCMC or variational inference methods not operating in context.
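The prior-fitting idea mentioned in that abstract can be illustrated in a heavily simplified form: simulate many (parameter, dataset) pairs from a prior, then fit a predictor that maps a dataset directly to a posterior summary. The sketch below is an assumption-laden toy, not the framework of Reuter et al.: the model is a conjugate Gaussian, the "network" is a least-squares fit on the context mean, and it recovers only the posterior mean rather than a full posterior. The names `N_TASKS`, `N_CONTEXT` and `SIGMA` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N_TASKS = 20_000   # number of simulated training tasks (assumed)
N_CONTEXT = 5      # observations per in-context dataset (assumed)
SIGMA = 1.0        # known observation noise (assumed)

# Simulate tasks: theta ~ N(0, 1), then a context x_1..x_n ~ N(theta, SIGMA^2).
theta = rng.normal(0.0, 1.0, size=N_TASKS)
contexts = theta[:, None] + SIGMA * rng.normal(size=(N_TASKS, N_CONTEXT))

# "Train" an amortized predictor of theta from a permutation-invariant summary.
# Squared-error regression targets the posterior mean E[theta | context].
features = np.column_stack([np.ones(N_TASKS), contexts.mean(axis=1)])
coef, *_ = np.linalg.lstsq(features, theta, rcond=None)
intercept, slope = coef

# Analytic posterior mean for this conjugate model is exact_slope * context mean.
exact_slope = N_CONTEXT / (N_CONTEXT + SIGMA**2)
print(intercept, slope, exact_slope)

# Amortized inference on a new context needs no further fitting.
new_context = np.array([0.9, 1.4, 0.7, 1.1, 1.3])
print(intercept + slope * new_context.mean(), exact_slope * new_context.mean())
```

In this toy the fitted map can be compared with the closed-form posterior mean; for the generalized linear and latent factor models discussed in the paper no such closed form exists, which is where MCMC and variational baselines come in.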

6 References

Correa, and Bareinboim. 2020. “A Calculus for Stochastic Interventions: Causal Effect Identification and Surrogate Experiments.” Proceedings of the AAAI Conference on Artificial Intelligence.

Everitt, Carey, Langlois, et al. 2021. “Agent Incentives: A Causal Perspective.” In Proceedings of the AAAI Conference on Artificial Intelligence.

Geiger, Ibeling, Zur, et al. 2024. “Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability.”

Gloeckler, Deistler, Weilbach, et al. 2024. “All-in-One Simulation-Based Inference.”

Guo, Cheng, Li, et al. 2020. “A Survey of Learning Causality with Data: Problems and Methods.” ACM Computing Surveys.

Hammond, Fox, Everitt, et al. 2023. “Reasoning about Causality in Games.” Artificial Intelligence.

Hubinger, Jermyn, Treutlein, et al. 2023. “Conditioning Predictive Models: Risks and Strategies.”

Huh, Cheung, Wang, et al. 2024. “The Platonic Representation Hypothesis.”

Kinney, and Lombrozo. 2024. “Building Compressed Causal Models of the World.” Cognitive Psychology.

Korbak, Perez, and Buckley. 2022. “RL with KL Penalties Is Better Viewed as Bayesian Inference.”

Liu, Zhang, Gong, et al. 2022. “Identifying Latent Causal Content for Multi-Source Domain Adaptation.”

Melnychuk, Frauen, and Feuerriegel. 2022. “Causal Transformer for Estimating Counterfactual Outcomes.” In Proceedings of the 39th International Conference on Machine Learning.

Müller, Hollmann, Arango, et al. 2021. “Transformers Can Do Bayesian Inference.”

Murfet, Clift, Doryn, et al. 2020. “Logic and the 2-Simplicial Transformer.” International Conference on Learning Representations.

Nichani, Damian, and Lee. 2024. “How Transformers Learn Causal Structure with Gradient Descent.”

Ortega, Kunesch, Delétang, et al. 2021. “Shaking the Foundations: Delusions in Sequence Models for Interaction and Control.” arXiv:2110.10819 [Cs].

Ravfogel, Svete, Snæbjarnarson, et al. 2025. “Gumbel Counterfactual Generation From Language Models.”

Reuter, Rudner, Fortuin, et al. 2025. “Can Transformers Learn Full Bayesian Inference in Context?”

Richens, and Everitt. 2024. “Robust Agents Learn Causal World Models.”

Riechers, Bigelow, Alt, et al. 2025. “Next-Token Pretraining Implies in-Context Learning.”

Saengkyongam, Rosenfeld, Ravikumar, et al. 2024. “Identifying Representations for Intervention Extrapolation.”

Scetbon, Jennings, Hilmkil, et al. 2024. “FiP: A Fixed-Point Approach for Causal Generative Modeling.”

von Kügelgen, Besserve, Wendong, et al. 2023. “Nonparametric Identifiability of Causal Representations from Unknown Interventions.” In Advances in Neural Information Processing Systems.

Wang, Xu, Tong, et al. 2021. “InferBERT: A Transformer-Based Causal Inference Framework for Enhancing Pharmacovigilance.” Frontiers in Artificial Intelligence.

Ward, MacDermott, Belardinelli, et al. 2024. “The Reasons That Agents Act: Intention and Instrumental Goals.”

Willig, Zečević, Dhami, et al. 2022. “Can Foundation Models Talk Causality?”

Ye, Tianzhu, Dong, Xia, et al. 2024. “Differential Transformer.”

Ye, Naimeng, Yang, Siah, et al. 2024. “Pre-Training and in-Context Learning IS Bayesian Inference a La De Finetti.”

Zečević, Willig, Dhami, et al. 2023. “Causal Parrots: Large Language Models May Talk Causality But Are Not Causal.” Transactions on Machine Learning Research.

Zhao, Brekelmans, Makhzani, et al. 2024. “Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo.” In Proceedings of the 41st International Conference on Machine Learning.