What use is utility?

If we must use the expected-utility-maximiser model for humans, what is the utility we should use?

2025-06-05 — 2026-04-09

Wherein the implied utility functions of animals are considered in light of machine-learning optimisation, and value learning is introduced as a framework for inferring preferences from behaviour.

Tags: adaptive, agents, AI safety, economics, evolution, extended self, game theory, gene, incentive mechanisms, learning, mind, probability, sociology, statistics, statmech, utility, wonk

Because machine-learning models so often optimise a loss function, we must to some extent internalise a world in which agents pursue something like a fixed utility, because at least some agents really do behave that way.
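A minimal sketch of that equivalence: gradient descent on a loss is literally gradient ascent on the fixed utility U(theta) = -loss(theta). The target state and step size here are arbitrary illustrative choices, not from any particular library.

```python
# Minimal sketch: "optimising a loss" as "pursuing a fixed utility".
# The target state and learning rate are arbitrary, illustrative choices.
import numpy as np

TARGET = np.array([2.0, -1.0])  # the state this particular loss "prefers"

def loss(theta: np.ndarray) -> float:
    """Toy loss: squared distance from the target state."""
    return float(np.sum((theta - TARGET) ** 2))

def grad_loss(theta: np.ndarray) -> np.ndarray:
    return 2.0 * (theta - TARGET)

theta = np.zeros(2)
for _ in range(100):
    theta -= 0.1 * grad_loss(theta)  # descending loss == ascending U = -loss

print(theta, loss(theta))  # theta converges on the "preferred" state [2, -1]
```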

If we need to construct such a pretence for animals, what does the implied utility function look like?

Cf. ecology of mind and what are human values.

And note, of course, that many ML algorithms don't need explicit utilities; they might subsist on intrinsic motivation instead.
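A toy sketch of that case, assuming a curiosity-style agent whose only "reward" is the prediction error of its own forward model; no external utility appears anywhere in the loop, and everything here is hypothetical illustration rather than any particular published algorithm.

```python
# Toy curiosity loop: the agent visits whichever state its forward model
# predicts worst, then improves the model there, which erodes the very
# signal that attracted it. No external utility anywhere. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
true_obs = rng.normal(size=5)        # unknown environment: state -> observation
model = np.zeros(5)                  # the agent's learned forward model
visits = np.zeros(5, dtype=int)

for _ in range(200):
    surprise = (true_obs - model) ** 2          # per-state prediction error
    s = int(np.argmax(surprise))                # seek the most surprising state
    visits[s] += 1
    model[s] += 0.5 * (true_obs[s] - model[s])  # learning extinguishes the "reward"

print(visits)  # visits spread out as each state becomes predictable
```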

1 Value learning

One answer comes from reinforcement learning: treat the observed behaviour as evidence about an unknown reward function, and infer the preferences that would best explain it. See value learning.
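A toy value-learning sketch, under the strong assumption that the observed agent is Boltzmann-rational, i.e. it chooses option a with probability proportional to exp(u[a]); we then recover the implied utilities by maximum likelihood from choice counts alone. Names and numbers are illustrative.

```python
# Toy value learning: infer a utility function from observed choices,
# assuming a Boltzmann-rational (softmax) chooser. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
true_u = np.array([0.0, 1.0, 2.5])                # hidden "animal" utilities
p_true = np.exp(true_u) / np.exp(true_u).sum()
choices = rng.choice(3, size=1000, p=p_true)      # observed behaviour

f = np.bincount(choices, minlength=3) / 1000.0    # empirical choice frequencies
u_hat = np.zeros(3)
for _ in range(500):                              # gradient ascent on mean log-likelihood
    p = np.exp(u_hat) / np.exp(u_hat).sum()
    u_hat += 0.5 * (f - p)                        # gradient of mean log-likelihood
u_hat -= u_hat[0]                                 # pin down the free constant

print(u_hat)  # close to true_u - true_u[0] = [0.0, 1.0, 2.5]
```

Even in this best case the data identify u only up to an additive constant (and, if we also fit a rationality "temperature", only up to scale), which is the usual underdetermination that haunts inferring preferences from behaviour.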
