World models arising in foundation models.

December 20, 2024 — May 22, 2025

approximation
Bayes
generative
language
machine learning
meta learning
Monte Carlo
neural nets
NLP
optimization
probabilistic algorithms
probability
statistics
stringology
time series

Placeholder for notes on what kinds of world models reside in neural nets.

1 Creating worlds to model

Rosas, Boyd, and Baltieri (2025) make a pleasing connection to the simulation hypothesis:

Recent work proposes using world models to generate controlled virtual environments in which AI agents can be tested before deployment to ensure their reliability and safety. However, accurate world models often have high computational demands that can severely restrict the scope and depth of such assessments. Inspired by the classic ‘brain in a vat’ thought experiment, here we investigate ways of simplifying world models that remain agnostic to the AI agent under evaluation. By following principles from computational mechanics, our approach reveals a fundamental trade-off in world model construction between efficiency and interpretability, demonstrating that no single world model can optimise all desirable characteristics. Building on this trade-off, we identify procedures to build world models that either minimise memory requirements, delineate the boundaries of what is learnable, or allow tracking causes of undesirable outcomes. In doing so, this work establishes fundamental limits in world modelling, leading to actionable guidelines that inform core design choices related to effective agent evaluation.
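The "minimise memory requirements" procedure draws on computational mechanics' notion of causal states: histories that induce the same predictive distribution can be merged without losing predictive power. A minimal sketch of that merging idea (my illustration, not the paper's construction), grouping histories of a binary process by their empirical next-symbol distribution:

```python
import random
from collections import defaultdict

def causal_states(sequence, history_len=3):
    """Group length-`history_len` histories into approximate causal states:
    histories are merged when they induce the same (rounded) empirical
    distribution over the next symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - history_len):
        hist = tuple(sequence[i:i + history_len])
        counts[hist][sequence[i + history_len]] += 1
    # Rounding the probabilities is a crude stand-in for a proper
    # statistical test of distribution equality.
    states = defaultdict(list)
    for hist, c in counts.items():
        total = sum(c.values())
        dist = tuple(sorted((s, round(n / total, 1)) for s, n in c.items()))
        states[dist].append(hist)
    return list(states.values())

# Sample the "golden mean" process: a 1 is never followed by another 1.
random.seed(0)
seq, prev = [], 0
for _ in range(10_000):
    prev = 0 if prev == 1 else random.choice([0, 1])
    seq.append(prev)

states = causal_states(seq)
# The five feasible length-3 histories collapse into two states:
# histories ending in 1 (the next symbol is forced to be 0) and
# histories ending in 0 (the next symbol is a fair coin flip).
print(len(states))  # 2
```

Merging shrinks the model's memory (two states instead of five distinct histories) while leaving next-symbol predictions intact, which is the flavour of efficiency trade-off the paper formalises.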

2 Platonic Representation Hypothesis

See Platonic Representation Hypothesis for a discussion of the idea that the latent space of a neural net is a Platonic representation of the world.

3 Causal world models

See causal abstraction for a discussion of the idea that the latent space of a neural net can discover causal representations of the world.

4 Incoming

NeurIPS 2023 Tutorial: Language Models meet World Models

5 References

Basu, Grayson, Morrison, et al. 2024. “Understanding Information Storage and Transfer in Multi-Modal Large Language Models.”
Chirimuuta. 2025. “The Prehistory of the Idea That Thinking Is Modelling.” Human Arenas.
Ge, Huang, Zhou, et al. 2024. “WorldGPT: Empowering LLM as Multimodal World Model.” In Proceedings of the 32nd ACM International Conference on Multimedia. MM ’24.
Hao, Gu, Ma, et al. 2023. “Reasoning with Language Model Is Planning with World Model.”
Hu, and Shu. 2023. “Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning.”
Richens, and Everitt. 2024. “Robust Agents Learn Causal World Models.”
Rosas, Boyd, and Baltieri. 2025. “AI in a Vat: Fundamental Limits of Efficient World Modelling for Agent Sandboxing and Interpretability.”
Wong, Grand, Lew, et al. 2023. “From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought.”
Yildirim, and Paul. 2024. “From Task Structures to World Models: What Do LLMs Know?” Trends in Cognitive Sciences.