Utility and fitness
Wants versus needs, selection theorems
2025-06-05 — 2025-09-23
Wherein the identification of utility with Malthusian log‑fitness is shown in a local, log‑linear regime, and evolution’s response is described as a constrained gradient step shaped by the genetic covariance G.
Fitness, in evolutionary biology, measures an organism’s expected reproductive success. Utility, in economics and decision theory, measures an agent’s preferences, i.e. it is what we seek out.
We often blur the lines between what an organism wants and what it evolutionarily needs. Why do we love sugar? The standard explanation is that in ancestral environments, sweetness signalled calorie density, which aided survival and reproduction. Our preferences (our “utility” for sweetness) were shaped by the “fitness” benefit of calories.
This intuition runs deep. Evolutionary biologists often describe organisms as if they are maximizing fitness. Similarly, economists have long argued that competition forces firms to act as if they are maximizing profits, regardless of the managers’ actual intentions (Friedman 1953). There is a whole field of evolutionary psychology that attempts to explain human desires as adaptations to ancestral environments. In genetic programming we attempt to evolve programs that maximize a fitness function, effectively treating the search process as an optimization problem.
This “as-if” optimization is a neat heuristic, but it is also clearly not exact. My evolved taste for sugar, once adaptive, is now easily hijacked by modern junk food, leading to outcomes misaligned with my long-term health. The utility function that evolution built into me is no longer a perfect proxy for my fitness. So these notions can come apart.
So, how exactly do utility and fitness relate? When are they the same, and when do they diverge?
This post aims to make the connection more precise. We will see that under certain conditions, fitness and utility align mathematically. But we will also explore why this alignment often breaks down in the complex, interactive world.
1 Definitions
1.1 Utility: What an Agent Wants
In economics and decision theory, utility is a mathematical representation of preferences. If I prefer A to B, my utility function \(u\) assigns a higher number to A: \(u (A) > u (B)\).
The von Neumann-Morgenstern (VNM) framework deals with preferences under uncertainty (von Neumann and Morgenstern 1944). If an agent’s preferences follow certain axioms of rationality (like transitivity—if I prefer A to B and B to C, I must prefer A to C), then that agent acts as if they are maximizing their expected utility.
Utility is a measure of what an agent wants, as evinced by their choices.
1.2 Fitness: What Evolution Needs
In evolutionary biology, fitness measures an organism’s expected reproductive success. It determines which traits are likely to become more common over generations.
Let’s consider a vector of phenotypic traits, \(\mathbf{z}\) (e.g., beak size, running speed).
- Absolute Fitness (\(W(\mathbf{z})\)): The expected number of offspring an organism with traits \(\mathbf{z}\) will produce.
- Malthusian Fitness (\(m(\mathbf{z})\)): The natural logarithm of absolute fitness. \(m(\mathbf{z}) = \ln W(\mathbf{z})\).
This last definition, Malthusian fitness (or log-fitness), is the key to bridging the gap with utility.
2 The Local Alignment: Evolution as Gradient Ascent
Evolution favours traits that increase fitness. This suggests that populations are climbing a slope in a “fitness landscape”. When does this climbing process look like optimization?
2.1 The Machinery of Selection
Biologists measure how strongly selection acts on traits using the selection gradient (\(\boldsymbol{\beta}\)). This gradient is essentially the slope of the regression of relative fitness on the traits (Lande and Arnold 1983). It points in the direction where fitness increases most steeply within the current population.
The response to selection—how the average trait changes in the next generation—is given by the Multivariate Breeder’s Equation (Lande 1979):
\[ \Delta \bar{\mathbf{z}} = \mathbf{G}\boldsymbol{\beta} \]
Here, \(\Delta \bar{\mathbf{z}}\) is the change in the average phenotype. \(\mathbf{G}\) is the additive genetic covariance matrix. That matrix describes how traits are inherited together and constrains the directions evolution can take. The equation tells us that evolution moves the average phenotype in the direction of \(\boldsymbol{\beta}\), but the path is shaped by the genetic constraints encoded in \(\mathbf{G}\).
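To make the role of \(\mathbf{G}\) concrete, here is a minimal numerical sketch of the Breeder’s Equation. The two-trait \(\mathbf{G}\) matrix and the selection gradient are made-up illustrative numbers, not estimates from any real population, and `selection_response` is a hypothetical helper name.

```python
import numpy as np

def selection_response(G, beta):
    """Predicted change in mean phenotype: delta z-bar = G @ beta."""
    return np.asarray(G, dtype=float) @ np.asarray(beta, dtype=float)

# Hypothetical example: selection acts directly on trait 1 only,
# but the two traits are strongly genetically correlated.
G = np.array([[1.0, 0.8],
              [0.8, 1.0]])
beta = np.array([1.0, 0.0])

delta_z = selection_response(G, beta)  # response is (1.0, 0.8)

# Cosine of the angle between the response and the selection gradient.
# A value below 1 means G has deflected the response away from beta.
cos_angle = delta_z @ beta / (np.linalg.norm(delta_z) * np.linalg.norm(beta))
print(delta_z, cos_angle)
```

Even though trait 2 is not under direct selection here, it changes anyway because it is inherited together with trait 1. With \(\mathbf{G}\) proportional to the identity, the response would point exactly along \(\boldsymbol{\beta}\).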
2.2 The “As-If” Equivalence
Now we can connect the selection gradient \(\boldsymbol{\beta}\) to the idea of a fitness landscape. Let’s look at Malthusian fitness, \(m(\mathbf {z})\). We want to know the gradient of this landscape, \(\nabla m\), which points towards the steepest increase in log-fitness.
It turns out there’s a connection. In many real-world scenarios, trait variation within a population is small and the fitness landscape is smooth. We can approximate that landscape locally with a first-order Taylor expansion.
In this “log-linear local regime,” an interesting identity emerges (Orr 2007; Morrissey and Goudie 2022):
\[ \boldsymbol{\beta} \approx \nabla m \]
The selection gradient, as measured by regression, is approximately equal to the gradient of the Malthusian fitness landscape. Substitute this back into the Breeder’s Equation:
\[ \Delta \bar{\mathbf{z}} \approx \mathbf{G}\nabla m \]
Cool. It turns out that the evolutionary response is a constrained gradient ascent on Malthusian fitness.
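We can sanity-check \(\boldsymbol{\beta} \approx \nabla m\) numerically. The sketch below assumes a hypothetical smooth log-fitness landscape \(m(\mathbf{z}) = \mathbf{a}^\top \mathbf{z} - \tfrac{1}{2}\mathbf{z}^\top \mathbf{C}\mathbf{z}\) and a population with small, normally distributed trait variation. The coefficients `a`, `C`, the mean `z_bar` and the spread `sigma` are all invented for illustration. It estimates \(\boldsymbol{\beta}\) by regressing relative fitness on the traits and compares that to the analytic gradient at the population mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log-fitness landscape m(z) = a.z - 0.5 * z^T C z
a = np.array([0.5, -0.2])
C = np.diag([0.3, 0.1])

def m(z):
    """Malthusian (log) fitness for each row of z."""
    return z @ a - 0.5 * np.einsum("ij,jk,ik->i", z, C, z)

def grad_m(z_point):
    """Analytic gradient of m at a single trait vector."""
    return a - C @ z_point

# Population with small trait variation around the mean phenotype.
z_bar = np.array([1.0, 2.0])
sigma = 0.05
z = z_bar + sigma * rng.standard_normal((200_000, 2))

# Relative fitness: absolute fitness divided by its population mean.
W = np.exp(m(z))
w = W / W.mean()

# Selection gradient: partial regression slopes of w on the centred traits.
X = np.column_stack([np.ones(len(z)), z - z_bar])
coef, *_ = np.linalg.lstsq(X, w, rcond=None)
beta = coef[1:]

print("regression beta:", beta)
print("grad m at mean: ", grad_m(z_bar))
```

Under these assumptions the two vectors agree closely. Increasing `sigma` widens the gap, because the population then samples the curvature of the landscape and the first-order approximation degrades.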
The Local Equivalence:
In this local regime, if we define an “as-if utility” \(u \equiv m\) (utility equals log-fitness), evolution behaves, to first order, as if it’s maximizing its utility function, subject to the constraints imposed by \(\mathbf{G}\) (Grafen 2007; Gardner 2009). The selection gradient (\(\boldsymbol{\beta}\)) then plays the role of the marginal utility of the traits (\(\nabla u\)) to the organism.
This is why the analogy between fitness and utility feels so strong: locally, under common conditions, they are mathematically equivalent to first order.
Note however, that we have talked about traits here, not behaviours. We’ll come back to the latter soon.
3 The Long Game
Evolution operates over time in uncertain environments. This introduces another, perhaps deeper, reason why Malthusian fitness (log-fitness) acts as the utility function that evolution maximizes.
Evolution is a multiplicative process. If my lineage has two offspring in the first generation, and each of them has three offspring in the second, I have \(2 \times 3 = 6\) descendants.
If fitness varies due to environmental randomness, the long-term outcome depends on the geometric mean fitness, not the arithmetic mean. Because of compounding, a strategy with high variance is risky: a few bad generations can severely hamper long-term growth or even cause extinction.
Mathematically, maximizing the long-run growth rate of a lineage is equivalent to maximizing the expected value of the logarithm of fitness (Cohen 1966):
\[ \text{Maximize } \mathbb{E}[\ln W] \]
This is mathematically identical to maximizing expected utility where \(u = \ln W\). The same idea appears as the Kelly Criterion in finance and gambling (Kelly 1956; Breiman 1961). If we reinvest our winnings in a multiplicative gamble, the strategy that maximizes the long-run growth rate of wealth is the one that maximizes the expected logarithm of the return.
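As a small illustration of the Kelly logic, consider a hypothetical repeated even-odds bet won with probability \(p\), where we stake a fraction \(f\) of current wealth each round. The setup and the names below are assumptions chosen for simplicity. For this particular bet the log-optimal fraction is known in closed form, \(f^* = 2p - 1\), and a grid search over \(\mathbb{E}[\ln(\text{return})]\) should recover it.

```python
import numpy as np

def expected_log_growth(f, p):
    """E[ln(return)] when staking fraction f on an even-odds bet won with probability p."""
    return p * np.log1p(f) + (1 - p) * np.log1p(-f)

p = 0.6  # hypothetical win probability

# Grid search for the fraction that maximizes expected log growth.
fractions = np.linspace(0.0, 0.99, 1000)
growth = expected_log_growth(fractions, p)
f_best = fractions[np.argmax(growth)]

f_kelly = 2 * p - 1  # closed form for an even-odds bet
print(f_best, f_kelly)

# Over-betting: staking 90% has a positive expected arithmetic return
# per round, but a negative expected log return, so wealth shrinks in
# the long run.
print(expected_log_growth(0.9, p))
```

The last line is the point of the exercise: a stake can look attractive on the arithmetic mean and still be ruinous under compounding.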
The Implication: Evolutionary Risk Aversion
Because the logarithm function is concave (it curves downward), maximizing \(\mathbb {E}[\ln W]\) penalizes variance. This is known as “bet-hedging” (Philippi and Seger 1989). Evolution is risk-averse, not because of a psychological preference, but because multiplicative growth demands it.
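Here is a toy comparison of a “risky” and a “hedged” strategy, as a sketch of why variance is penalized. The two environments, their probabilities, and the fitness values are invented for illustration. The risky strategy has the higher arithmetic mean fitness, but the lower (in fact negative) expected log-fitness, and a simulated lineage confirms that its realized per-generation growth rate is negative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fitness W in a (good, bad) environment for each strategy.
strategies = {
    "risky": np.array([3.0, 0.2]),
    "hedged": np.array([1.2, 1.2]),
}
p_env = np.array([0.5, 0.5])  # probability of good / bad generation

# One shared sequence of environments, so both lineages face the same luck.
generations = 10_000
envs = rng.choice(2, size=generations, p=p_env)

results = {}
for name, W in strategies.items():
    results[name] = {
        "arithmetic": float(p_env @ W),          # E[W]
        "log_growth": float(p_env @ np.log(W)),  # E[ln W]
        # Realized growth rate: mean log-fitness along the sampled history.
        "realized": float(np.log(W[envs]).mean()),
    }

for name, r in results.items():
    print(name, r)
```

The hedged lineage grows steadily while the risky one declines towards extinction, despite “winning” on average in any single generation.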
In this long-term perspective, utility is log-fitness, not merely as a local approximation, but as the fundamental objective function dictated by the dynamics of compounding growth under uncertainty.
4 This is suspect because fitness landscapes are somewhat fake
People from machine learning theory are always ready to believe in fitness landscapes because we spend all day thinking about loss landscapes. After due consideration, I must confess that fitness landscapes seem even more fake than loss landscapes. Or at least, in most selection processes in the wild there is no single fixed fitness landscape that all genotypes share.
4.1 Frequency Dependence and Game Theory
The most significant breakdown occurs when fitness is frequency-dependent—that is, the success of our strategy depends on what everyone else is doing.
In this case, the fitness landscape is not static. As the population evolves, the landscape itself shifts. There is no single, static utility function that evolution maximizes. Instead, we must use the tools of Evolutionary Game Theory (Maynard Smith 1982). Evolution may cycle (like in Rock-Paper-Scissors) or reach complex equilibria rather than climbing to a peak.
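The Rock-Paper-Scissors cycling can be sketched with the standard replicator dynamics, \(\dot{x}_i = x_i\left[(A\mathbf{x})_i - \mathbf{x}^\top A \mathbf{x}\right]\). The zero-sum payoff matrix, the initial frequencies, and the step size below are illustrative assumptions, and the integrator is a plain hand-rolled fourth-order Runge-Kutta rather than anything tuned.

```python
import numpy as np

# Payoff to the row strategy against the column strategy
# (order: rock, paper, scissors; win = +1, loss = -1).
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

def replicator_rhs(x):
    """Replicator dynamics: each strategy grows at its payoff minus the mean payoff."""
    f = A @ x
    return x * (f - x @ f)

def simulate(x0, dt=0.01, steps=5000):
    """Integrate the replicator equation with a fixed-step RK4 scheme."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        k1 = replicator_rhs(x)
        k2 = replicator_rhs(x + 0.5 * dt * k1)
        k3 = replicator_rhs(x + 0.5 * dt * k2)
        k4 = replicator_rhs(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        path.append(x.copy())
    return np.array(path)

path = simulate([0.5, 0.3, 0.2])

# Which strategy is most common at each time step?
leaders = np.argmax(path, axis=1)
print(sorted(set(leaders.tolist())))
```

In this zero-sum version the population never settles: each strategy takes its turn as the most common one, and the frequencies orbit the mixed equilibrium rather than climbing to any peak. There is no fixed landscape here, since each strategy’s fitness is defined only relative to the current population mix.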
4.2 Evolutionary Mismatch and Proxy Goals
Evolution is slow. The utility functions encoded in our brains were shaped by ancestral environments. When the environment changes rapidly (as human environments have), these evolved utilities can become misaligned with current fitness. Our preference for sugar is a prime example of this mismatch.
Utility often tracks proxies for fitness—pleasure, status, comfort. These proxies can be hijacked by superstimuli, leading to behaviors that satisfy utility but decrease fitness.
4.3 Niche Construction and Feedback
If organisms actively modify their environment (niche construction), the fitness landscape becomes dynamic and path-dependent (Odling-Smee, Laland, and Feldman 2003). The environment we face today depends on the actions of our ancestors. This feedback loop complicates the idea of simple optimization, as the optimization target is constantly moving (Schulz 2014).
4.4 Multi-Level Selection
When selection acts at multiple levels (e.g., individuals within groups, and groups competing with other groups), stuff gets weird. What maximizes individual fitness might decrease group fitness (e.g., tragedy of the commons). Defining a coherent “group utility” becomes highly problematic (Okasha 2008; Okasha 2009) and is a whole running battle in evolutionary biology.
But the biggest distinction IMO is that we still haven’t thought about how agents might not simply enact traits, but might have behaviours, and it is behaviours that are the domain of utility theory.
5 Selection Theorems
The connection we’ve drawn between fitness and utility, where evolutionary pressure forces behaviour to align with maximizing log-fitness, is not a unique phenomenon. It is one example of what may be a more general principle: selection forces structure.
When a system—whether a biological organism, a firm in a market, or an AI algorithm—is strongly optimized to perform well according to some criterion, the winning systems tend to share certain internal organizations. If a specific architecture is required for optimal performance, only systems possessing that architecture will survive selection.
This looks more like an argument about utilities; it is about how selection pressures might force systems to behave as if they are maximizing a utility function which is related in some useful way to their fitness. This idea is formalized in what John Wentworth calls Selection Theorems. These theorems follow a general template:
If a system is selected to perform well under a criterion \(\mathcal {C}\), then near-optimal elements must possess a certain internal structure or “type signature” \(\mathcal {T}\).
Let’s look at a classic example to see how this might work.
5.1 Example: The Necessity of Consistency (Coherence Theorems)
Now consider a different kind of pressure: surviving in a competitive environment where resources are at stake, like a market or a betting scenario. The selection criterion here is simple: avoid being exploited into a guaranteed loss.
Agents can fail this criterion in two main ways: inconsistent preferences or inconsistent beliefs.
Inconsistent Preferences (The Money Pump): Imagine an agent who prefers Apples to Bananas (A>B), Bananas to Carrots (B>C), but also Carrots to Apples (C>A). This agent can become a “money pump”. A trader could convince the agent to pay a small fee to trade their Carrot for a Banana (since B>C), then to pay a fee to trade the Banana for an Apple (A>B), and finally to pay a fee to trade the Apple back for a Carrot (C>A). The agent ends up back where they started, but poorer.
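The money pump is easy to simulate. This is a minimal sketch in which the agent accepts any swap to an item it strictly prefers, as long as it can pay the fee. The fee, the starting wealth (in arbitrary integer units) and the function name `run_money_pump` are hypothetical.

```python
# Cyclic (intransitive) strict preferences: (x, y) means "prefers x to y".
PREFERS = {("apple", "banana"), ("banana", "carrot"), ("carrot", "apple")}

def run_money_pump(start_item, offers, fee, wealth):
    """Offer swaps in sequence; the agent pays `fee` for each swap it prefers."""
    item = start_item
    for offered in offers:
        if (offered, item) in PREFERS and wealth >= fee:
            item = offered
            wealth -= fee
    return item, wealth

# The trader walks the agent around the preference cycle three times.
offers = ["banana", "apple", "carrot"] * 3
item, wealth = run_money_pump("carrot", offers, fee=1, wealth=100)
print(item, wealth)  # carrot 91
```

Every single trade looks like an improvement to the agent, yet after nine trades it holds the same carrot and has nine fewer units of wealth. The trader can keep going until the agent is broke.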
Inconsistent Beliefs (The Dutch Book): Suppose an agent has beliefs that violate the laws of probability. For example, they believe there is a 60% chance Team X will win a game, and a 60% chance Team X will lose that same game.
A sharp adversary can construct a Dutch Book against this agent. The adversary offers two bets, priced according to the agent’s beliefs:
- Bet 1: Pays $1 if Team X wins. The agent, believing this is 60% likely, is willing to pay up to $0.60 for this bet.
- Bet 2: Pays $1 if Team X loses. The agent is also willing to pay up to $0.60 for this bet.
The adversary sells both bets to the agent for $0.60 each. The agent has spent a total of $1.20. However, only one outcome can occur (win or loss), so the agent will only win back $1.00. The agent has accepted a combination of bets that guarantees a net loss of $0.20, no matter the outcome.
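The arithmetic generalizes to any set of mutually exclusive, exhaustive outcomes. The sketch below assumes the agent buys one bet per outcome, each priced at its own credence times the stake. `agent_net_payoffs` is a hypothetical helper written for this example.

```python
def agent_net_payoffs(credences, stake=1.0):
    """Net payoff to the agent in each outcome after buying one bet per outcome.

    Each bet pays `stake` if its outcome occurs and costs credence * stake.
    Exactly one outcome occurs, so the agent collects `stake` exactly once.
    """
    total_cost = sum(c * stake for c in credences.values())
    return {outcome: stake - total_cost for outcome in credences}

incoherent = {"X wins": 0.6, "X loses": 0.6}  # credences sum to 1.2
coherent = {"X wins": 0.6, "X loses": 0.4}    # credences sum to 1.0

print(agent_net_payoffs(incoherent))  # about -0.20 in every outcome
print(agent_net_payoffs(coherent))    # break-even in every outcome
```

When the credences sum to more than 1, the agent loses the excess whichever outcome occurs. This sketch only covers the case where the adversary sells. If the credences summed to less than 1, the adversary would instead buy the bets from the agent at those prices, with the same guaranteed result.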
The Result: Coherence Theorems (including VNM utility and Dutch Book arguments) prove that if an agent is to avoid guaranteed losses, their beliefs and preferences must be internally consistent (von Neumann and Morgenstern 1944).
- Selection Criterion (\(\mathcal{C}\)): Avoid being exploited for guaranteed losses (non-domination).
- Resulting Structure (\(\mathcal{T}\)): The agent must act as if it has a consistent probability distribution (its beliefs obey the laws of probability) and a consistent utility function (its preferences are transitive). In other words, it must behave as an Expected Utility Maximizer.
These theorems don’t argue that agents “should” maximize utility for philosophical reasons. They argue that agents who don’t will be exploited, outcompeted, and ultimately eliminated.
Note, however, that they say much less about how long we can get away with being irrational in this sense. TBC