What are human values?

If we assume that humans are pursuing stuff, what is that stuff? Thick models of value, eudaimonic rationality,…

2025-06-05 — 2026-04-09

Wherein the assumption of utility maximisation is set aside, and alternative framings of human goodness are examined, with reference to eudaimonic rationality and open-ended intelligence.

Tags: adaptive, agents, AI safety, economics, evolution, extended self, game theory, gene, incentive mechanisms, learning, mind, probability, sociology, statistics, statmech, utility, wonk
Figure 1: A placeholder.

Let us relax the assumption that humans are best understood as acting to maximise their utility. How else can we understand what counts as good to have more of, for these open-ended intelligences?

Cf. ecology of mind.
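What does the utility-maximisation assumption buy, and where does it strain? The revealed-preference tradition (Samuelson 1938) reads a fixed utility function off observed choices. Here is a minimal sketch of where that inference breaks, assuming, purely for illustration, an agent with a linear utility over two goods whose weight may drift over time (cf. Pettigrew 2019; Carroll et al. 2024). Every function name and number below is invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(weight_at, rounds=500):
    """An agent repeatedly picks between two random bundles of goods (x, y),
    preferring the one with higher utility u(z) = w*x + (1 - w)*y, where
    w = weight_at(t) may drift over time."""
    data = []
    for t in range(rounds):
        a, b = rng.uniform(0.0, 1.0, size=(2, 2))
        w = weight_at(t)
        chose_a = w * a[0] + (1 - w) * a[1] >= w * b[0] + (1 - w) * b[1]
        data.append((a, b, chose_a))
    return data

def best_static_fit(data):
    """Grid-search the single fixed weight that rationalises the most
    observed choices -- the revealed-preference assumption in miniature."""
    def agrees(w, a, b, chose_a):
        return (w * a[0] + (1 - w) * a[1] >= w * b[0] + (1 - w) * b[1]) == chose_a
    grid = np.linspace(0.0, 1.0, 101)
    return max((sum(agrees(w, *obs) for obs in data) / len(data), w) for w in grid)

# Static tastes: one fixed weight explains (almost) every choice.
print(best_static_fit(simulate(lambda t: 0.8)))
# Drifting tastes: no single weight rationalises the full record.
print(best_static_fit(simulate(lambda t: 0.2 + 0.6 * t / 500)))
```

With static tastes the grid search recovers the weight and rationalises essentially every choice; once the weight drifts, no single fixed utility fits the record, which is roughly the worry that the "beyond preferences" literature below starts from.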

1 Incoming

  • Paretotopian Goal Alignment

  • Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value

  • After Orthogonality: Virtue-Ethical Agency and AI Alignment

    The concept of eudaimonia, I argue, suggests a form of rational activity without a strict distinction between means and ends, or between ‘instrumental’ and ‘terminal’ values. In this model of rational activity, a rational action is an element of a valued practice in roughly the same sense that a note is an element of a melody, a time-step is an element of a computation, and a moment in an organism’s cellular life is an element of that organism’s self-subsistence and self-development.[…]

    My central claim is that our intuitions about the nature of human flourishing are implicitly intuitions that eudaimonic rationality can be functionally robust in a sense highly critical to AI alignment. More specifically, I argue that in light of our best intuitions about the nature of human flourishing it’s plausible that eudaimonic rationality is a natural form of agency, and that eudaimonic rationality is effective even by the light of certain consequentialist approximations of its values. I then argue that if our goal is to align AI in support of human flourishing, and if it is furthermore plausible that eudaimonic rationality is natural and efficacious, then many classical AI safety considerations and ‘paradoxes’ of AI alignment speak in favor of trying to instill AIs with eudaimonic rationality.
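To make the melody analogy in the quotation above concrete, here is a deliberately cartoonish sketch, not anything from the paper, contrasting additive per-note scoring (each step rewarded in isolation) with whole-trajectory scoring, where a step counts only through its role in the pattern. The note set and both scoring rules are invented.

```python
# A toy contrast between additive (means/ends) evaluation and whole-trajectory
# ("practice") evaluation, riffing on the note-in-a-melody analogy above.
# The scoring rules are invented for illustration, not taken from the paper.

NOTES = ["C", "D", "E", "F", "G", "A", "B"]

def additive_score(melody):
    """Consequentialist caricature: each note earns reward in isolation."""
    per_note = {"C": 2, "E": 2, "G": 2}  # pretend these notes are intrinsically 'good'
    return sum(per_note.get(n, 0) for n in melody)

def practice_score(melody):
    """Eudaimonic caricature: a note counts via its role in the whole phrase.
    Here the valued pattern is mostly stepwise motion that resolves to the tonic."""
    steps = [abs(NOTES.index(x) - NOTES.index(y)) for x, y in zip(melody, melody[1:])]
    smooth = all(s <= 2 for s in steps)   # no large leaps
    resolves = melody[-1] == "C"          # returns home to the tonic
    return int(smooth) + int(resolves)

arpeggio = ["C", "G", "E", "G", "C", "G"]     # maximises per-note reward
phrase = ["C", "D", "E", "F", "E", "D", "C"]  # coheres as a phrase

print(additive_score(arpeggio), practice_score(arpeggio))  # 12 0
print(additive_score(phrase), practice_score(phrase))      # 8 2
```

The arpeggio piles up per-note reward but never coheres as a phrase; the stepwise phrase earns less additive reward yet scores fully as a practice. That is one way to read the claim that eudaimonic rationality dissolves the strict instrumental/terminal split.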

2 References

Carroll, Foote, Siththaranjan, et al. 2024. “AI Alignment with Changing and Influenceable Reward Functions.”
Collins, Sucholutsky, Bhatt, et al. 2024. “Building Machines That Learn and Think with People.”
Doudkin, Pataranutaporn, and Maes. 2025. “AI Persuading AI Vs AI Persuading Humans: LLMs’ Differential Effectiveness in Promoting Pro-Environmental Behavior.”
Edelman, Tan, Lowe, et al. 2025. “Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value.”
Franklin, and Ashton. 2022. “Preference Change in Persuasive Robotics.”
Franklin, Ashton, Gorman, et al. 2022. “Recognising the Importance of Preference Change: A Call for a Coordinated Multidisciplinary Research Effort in the Age of AI.”
Gabriel. 2020. “Artificial Intelligence, Values, and Alignment.” Minds and Machines.
Gabriel, Manzini, Keeling, et al. 2024. “The Ethics of Advanced AI Assistants.”
Hadfield-Menell, and Hadfield. 2018. “Incomplete Contracting and AI Alignment.”
Hyland, Gavenčiak, Costa, et al. 2024. “Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents.”
Kim. 2020. “Deep Learning and Principal–Agent Problems of Algorithmic Governance: The New Materialism Perspective.” Technology in Society.
Klingefjord, Lowe, and Edelman. 2024. “What Are Human Values, and How Do We Align AI to Them?”
Kulveit, Douglas, Ammann, et al. 2025. “Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development.”
Liu, Wang, Li, et al. 2024. “Attaining Human Desirable Outcomes in Human-AI Interaction via Structural Causal Games.”
Pettigrew. 2019. Choosing for Changing Selves.
Samuelson. 1938. “A Note on the Pure Theory of Consumer’s Behaviour.” Economica.
Stray, Vendrov, Nixon, et al. 2021. “What Are You Optimizing for? Aligning Recommender Systems with Human Values.”
Ying, Zhi-Xuan, Wong, et al. 2025. “Understanding Epistemic Language with a Language-Augmented Bayesian Theory of Mind.”
Zhi-Xuan, Carroll, Franklin, et al. 2025. “Beyond Preferences in AI Alignment.” Philosophical Studies.