Returns to scale in technological society in general

Is bigger better enough that it is the only thing that can survive? Is small ever beautiful? How about medium-sized?

2026-04-20 — 2026-04-20

Wherein the distinct scaling exponents of cities, firms, and nation-states are compared, and the question of whether AI-mediated coordination may dissolve the Coasean boundary of the firm is raised.

buzzword
cooperation
culture
design
diy
economics
housing
incentive mechanisms
institutions
insurgency
making things
policy
spatial
straya
the rather superior sort of city
wonk
Figure 1

Consider the scaling laws across cities, economies, neural nets… Does it follow that in a competitive selection environment, ultimately, the world converges upon one leviathan? One giant system? Is the attracting state a single, gigantic economy? In AI safety people wonder about the singleton, the ultimate mega agent. Should we think about the singleton economy, the singleton nation state?

Is there any leverage in being small and idiosyncratic when there is such obvious gain in being strong, powerful and all-encompassing? Does the world converge to unipolarity? Do we grind out all the small states, the tiny coalitions, the middle powers? Or is there some kind of stable polycentricity, where there are niches for small and medium-sized entities to thrive? I would like to know.

1 Urban scaling laws

The urban scaling results are the empirical anchor for all of this. Bettencourt (2013) provides the theoretical derivation: cities are social reactors, and the superlinear scaling emerges from the increasing number of social interactions per capita as density grows. Bettencourt et al. (2007) established the empirical regularities — wealth creation and innovation scale with exponent \(\beta \approx 1.2\), infrastructure with \(\beta \approx 0.8\).
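As a rough illustration of how such an exponent is usually estimated, here is a minimal sketch that fits \(Y = Y_0 N^\beta\) by ordinary least squares on logs. The data are synthetic and every number in it (sample size, noise level, the "true" exponent of 1.2) is an assumption chosen for illustration, not a value taken from the cited papers; the function name `fit_scaling_exponent` is hypothetical.

```python
import numpy as np

def fit_scaling_exponent(population, output):
    """Estimate (Y0, beta) in output = Y0 * population**beta via OLS on logs."""
    log_n = np.log(np.asarray(population, dtype=float))
    log_y = np.log(np.asarray(output, dtype=float))
    # polyfit with degree 1 returns [slope, intercept]
    beta, log_y0 = np.polyfit(log_n, log_y, 1)
    return float(np.exp(log_y0)), float(beta)

# Synthetic "cities": all values below are made up for illustration.
rng = np.random.default_rng(0)
population = 10 ** rng.uniform(4, 7, size=500)       # 10^4 to 10^7 residents
true_y0, true_beta = 2.0, 1.2                        # assumed, not empirical
noise = rng.normal(0.0, 0.2, size=population.size)   # multiplicative noise
output = true_y0 * population ** true_beta * np.exp(noise)

y0_hat, beta_hat = fit_scaling_exponent(population, output)
print(f"estimated beta = {beta_hat:.3f}, estimated Y0 = {y0_hat:.3f}")
```

On clean synthetic data the fit recovers the exponent easily. With real cities the hard part is deciding what counts as a city and which indicator to regress, not the regression itself.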

A methodological critique of the naive approach may be found in Cottineau et al. (2017): the measured exponent is sensitive to how we define city boundaries, which indicators we use, and which countries we look at. Arcaute et al. (2015) show that for England and Wales, many scaling results vanish or flip between the superlinear and sublinear regimes when city boundaries are defined differently. This matters: if the exponent is an artefact of the boundary definition, the whole edifice wobbles.

There is much more (Arcaute et al. 2015; Bettencourt and Lobo 2016; Cottineau et al. 2017; Kühnert, Helbing, and West 2006; Prieto Curiel, Cabrera-Arnau, and Bishop 2022; Rybski, Arcaute, and Batty 2019).

2 Economic scaling laws

The literature on returns to scale is enormous and old, but I want to slice it by the unit of analysis: firms, nations, cities. The scaling exponents turn out to be different in each case, and the reasons why are the interesting part.

2.1 Firms

Geoffrey West’s Santa Fe Institute group found that companies scale sublinearly, more like biological organisms than cities (West 2017). Sales scale roughly linearly with employee count (exponent \(\approx 1\)), but profitability and innovation per capita do not increase with firm size the way they do in cities. The implication is that firms, like organisms, are bounded growers: they slow down, stop growing, and die. The half-life of a publicly traded US company is about 10 years; roughly half of the firms listed in a given year are gone a decade later.

Why? Coase proposed a reason in 1937 (Coase 1937): firms exist because they reduce transaction costs below what the market would charge, but they expand only until the cost of organising one more internal transaction equals the cost of doing it on the market. There are “decreasing returns to the entrepreneur function” — management overhead, communication pathways (\(O(n^2)\) in the number of interacting units), errors of coordination. Which is to say, firms hit diseconomies of scale because internal coordination is expensive.
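The marginal condition can be made concrete with a toy model. The sketch below assumes that internal coordination cost is proportional to the number of pairwise communication pathways, \(n(n-1)/2\), and that buying the same service on the market costs a flat amount per unit. Both are simplifying assumptions of mine rather than anything in Coase, and all function names and parameter values are hypothetical.

```python
def pairwise_channels(n: int) -> int:
    """Number of distinct pairs among n interacting units: n(n-1)/2."""
    return n * (n - 1) // 2

def marginal_internal_cost(n: int, cost_per_channel: float) -> float:
    """Extra coordination cost incurred by growing the firm from n-1 to n units."""
    return cost_per_channel * (pairwise_channels(n) - pairwise_channels(n - 1))

def coasean_boundary(market_cost: float, cost_per_channel: float,
                     max_units: int = 10_000) -> int:
    """Largest firm size at which absorbing the last unit is no dearer than the market.

    Toy assumption: each outsourced unit costs a flat `market_cost`.
    """
    n = 1
    while n < max_units and marginal_internal_cost(n + 1, cost_per_channel) <= market_cost:
        n += 1
    return n

# Halving the per-channel coordination cost roughly doubles the boundary.
for cost in (1.0, 0.5):
    print(f"cost per channel {cost}: boundary at {coasean_boundary(10.0, cost)} units")
```

Under these assumptions the boundary is set entirely by the ratio of market cost to per-channel coordination cost, which is why anything that changes either side of that ratio moves the equilibrium firm size.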

This is the Coasean boundary, and it’s the reason we don’t see a single firm absorbing the whole economy, or at least haven’t so far. This might change.

2.2 Nations

Alesina and Spolaore (2003) model national boundaries as an equilibrium between economies of scale in public goods provision and the costs of preference heterogeneity. Bigger nations amortize the cost of armies, courts, infrastructure, and monetary systems over more taxpayers. But larger populations mean more diverse preferences, languages, cultures — harder to govern democratically, harder to produce public goods that satisfy everyone.

Their framework predicts that the optimal size of nations depends on the international environment. Under free trade, small nations can capture scale economies through markets rather than territory, so the equilibrium shifts toward smaller states. That is broadly what we have observed since 1945, with the number of internationally recognised states roughly tripling. Under autarky, size matters more, and empires make economic sense.

Empirically, GDP scales sublinearly with national population — an exponent around 0.8, meaning doubling population gives less than double output (Prieto Curiel, Cabrera-Arnau, and Bishop 2022). This is the opposite of the urban scaling result (\(\beta \approx 1.15\) for socioeconomic variables in cities). Nations are not cities. The density of social interaction that drives superlinear urban scaling is diluted across a national territory.
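As a worked check of the doubling claim, assume a pure power law \(Y = Y_0 N^\beta\), where the prefactor \(Y_0\) is notation introduced here. Doubling the population multiplies output by \(2^\beta\) and output per capita by \(2^{\beta - 1}\):

\[ \frac{Y(2N)}{Y(N)} = 2^{\beta}, \qquad 2^{0.8} \approx 1.74, \qquad 2^{1.15} \approx 2.22 \]

\[ \frac{Y(2N)/2N}{Y(N)/N} = 2^{\beta - 1}, \qquad 2^{-0.2} \approx 0.87, \qquad 2^{0.15} \approx 1.11 \]

So at the national exponent a doubling costs about 13% of per-capita output, while at the urban exponent it adds about 11%.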

2.3 Some puzzles

We have three different systems with three different scaling regimes:

  • Cities scale superlinearly (\(\beta \approx 1.15\)) for socioeconomic output — innovation, wealth, crime. This is driven by the density of social interactions (Bettencourt et al. 2007).
  • Firms scale roughly linearly to sublinearly. Internal coordination costs impose a ceiling. They behave more like organisms (West 2017).
  • Nations scale sublinearly (\(\beta \approx 0.8\)). Heterogeneity costs and governance overhead eat the returns to public goods provision (Alesina and Spolaore 2003; Prieto Curiel, Cabrera-Arnau, and Bishop 2022).

This gives us a hierarchy: cities are the engine of increasing returns, but both the organisational containers we put around economic activity (firms) and the political containers we put around territory (states) exhibit diminishing returns to scale. The superlinearity of cities is embedded in, and constrained by, sublinear envelopes.

This is a multi-level agency problem dressed in scaling exponents. The city, the firm, the nation — these are not just different units of analysis, they are nested levels of organisation with different dynamics at each level. The scaling exponent \(\beta\) is, in a sense, a summary statistic for the kind of agent a system is at a given level of coarse-graining. Organisms (\(\beta < 1\)): bounded, mortal, efficient. Cities (\(\beta > 1\)): open-ended, accelerating, crisis-prone. The question “at what level do we model agency?” and the question “what is the scaling exponent?” turn out to be the same question asked in different vocabularies.

3 The cosmic version

Wong and Bartlett (2022) push this question to its cosmic limit. They take the urban scaling exponent \(\beta > 1\) from Bettencourt et al. (2007) and ask what happens when the entire planetary civilisation becomes, in effect, one city: one densely networked informational superorganism connected through the dataome (see footnote 1). The growth equation

\[ \frac{dN(t)}{dt} = \left(\frac{Y_0}{E}\right) N(t)^\beta - \left(\frac{R}{E}\right) N(t) \]

predicts singularities — moments when population and energy demand tend to infinity in finite time — which must be averted by innovations that reset the system’s trajectory. But the interval between resets, \(t_{\text{cycle}}\), shrinks as the population grows. Eventually \(t_{\text{cycle}}\) drops below the timescale on which innovation can actually occur (\(t_{\text{innovate}}\)), and the system faces what they call asymptotic burnout: collapse driven by the superlinear dynamics that previously drove growth.
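The finite-time singularity can be made explicit. What follows is a sketch under the assumptions that \(\beta > 1\) and that \(Y_0\), \(R\) and \(E\) are constant, using the shorthand \(a = Y_0/E\) and \(b = R/E\) (introduced here, not in the source). The substitution \(u = N^{1-\beta}\) turns the growth equation into a linear one:

\[ \frac{du}{dt} = (1-\beta)\,(a - b\,u), \qquad u(t) = \frac{a}{b} + \left(u_0 - \frac{a}{b}\right) e^{(\beta-1)\,b\,t}, \qquad u_0 = N(0)^{1-\beta} \]

When \(N(0) > (b/a)^{1/(\beta-1)}\) we have \(u_0 < a/b\), so \(u\) falls to zero, and \(N = u^{-1/(\beta-1)}\) diverges, at the finite time

\[ t_c = -\frac{1}{(\beta-1)\,b} \ln\!\left(1 - \frac{b}{a}\, N(0)^{1-\beta}\right) \approx \frac{E}{(\beta-1)\,Y_0}\, N(0)^{1-\beta} \]

where the approximation holds for large \(N(0)\). Because \(1-\beta < 0\), \(t_c\) shrinks as \(N(0)\) grows. If each innovation restarts the clock from a larger population, the time to the next singularity shortens, which is one way to read the shrinking \(t_{\text{cycle}}\).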

Their proposed resolution is homeostatic awakening — the civilization consciously reorienting away from unbounded growth toward persistence and well-being. Which is to say, the civilization invents a new objective function. This is presented as a resolution to the Fermi paradox: either civilizations burn out (short-lived), or they pivot to homeostasis and become quiet (long-lived but difficult to detect). Either way, no galaxy-spanning Type III civilizations.

I find this idea interesting as a question organizer even if the quantitative model is schematic. The key variable is \(\beta\): whether the scaling exponent for a given system sits above or below 1 determines whether it exhibits increasing returns (and eventual crisis) or diminishing returns (and eventual stagnation). And the value of \(\beta\) is not the same for all systems.

4 AI and the Coasean boundary

Figure 2

If the Coasean boundary of the firm is set by the cost of internal coordination relative to market transaction costs, and if AI agents radically reduce both, what happens to the equilibrium firm size?

There are at least two competing effects. Korinek and Vipra (2025) argue that AI itself exhibits enormous economies of scale — training frontier models requires billions of dollars in compute, and the resulting capabilities can be deployed at near-zero marginal cost. This pushes toward concentration: a few firms with the best models capture most of the value.

But there is a countervailing force. If AI agents can handle complex contracting, negotiation, and monitoring at scale — Coasean bargaining at scale, as it were — then transaction costs on the market side also collapse. The “Headless Firm” model (Harré and Ormerod 2025) proposes that agentic AI changes how coordination costs scale: from \(O(n^2)\) in human-managed organisations to \(O(n)\) in protocol-mediated agent systems. If so, we might see a simultaneous expansion of what large organisations can coordinate and shrinkage of what needs to be inside an organisation at all. Which effect dominates?
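To get a feel for what moving from \(O(n^2)\) to \(O(n)\) would mean, here is a minimal sketch that counts coordination links under two stylised topologies. Reading \(O(n^2)\) as "every unit may need a channel to every other" and \(O(n)\) as "every unit keeps one link to a shared protocol layer" is my illustrative interpretation, not the cited model itself, and the function names are hypothetical.

```python
def pairwise_links(n: int) -> int:
    """Links if every unit may need to coordinate with every other: n(n-1)/2."""
    return n * (n - 1) // 2

def protocol_links(n: int) -> int:
    """Links if every unit talks only to one shared protocol layer: n."""
    return n

def overhead_ratio(n: int) -> float:
    """How many times more links the pairwise topology needs: (n-1)/2."""
    return pairwise_links(n) / protocol_links(n)

for n in (10, 100, 1000):
    print(f"n={n}: pairwise={pairwise_links(n)}, "
          f"protocol={protocol_links(n)}, ratio={overhead_ratio(n)}")
```

The ratio grows linearly in \(n\), so the saving is negligible for a small team and very large for an organisation of thousands, if the protocol-mediated picture holds at all.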

Wolpert and Harper (2025) model societies as computers, where the computational power available depends on how agents are organised. If AI lets us reorganise the topology of economic interaction — not just doing the same coordination faster, but enabling coordination structures that were previously impossible — then the scaling exponent itself might change. We might move the boundary between the firm-like regime (\(\beta \leq 1\)) and the city-like regime (\(\beta > 1\)).

I don’t know which effect wins. Possibly both at once, in different domains. But the stakes are those of Wong and Bartlett (2022): if we push \(\beta\) high enough, we push \(t_{\text{cycle}}\) down, and we accelerate the approach toward burnout — or toward the moment we have to decide whether homeostatic awakening is something we can actually do.

5 Where does this leave the Singleton?

The scaling laws give us a tentative answer to the opening question. The world probably does not converge on one system, because the scaling exponents differ across levels of organisation. Cities generate superlinear returns, but the containers that govern cities — firms and states — face diminishing returns to scale. The Leviathan is bounded.

But that conclusion rests on current coordination technology. If AI changes the exponents — if it makes large-scale coordination cheap enough that firms or states start exhibiting city-like superlinear scaling — then the attractor shifts. The singleton becomes more plausible, or at least the threshold for “too big” moves upward.

There is a flavour of technological determinism lurking here — the idea that coordination technology determines the viable scale of organisation, and therefore the shape of political and economic life. If the printing press enabled the nation state, and if the internet enabled global supply chains and platform monopolies, then AI-mediated coordination might enable — or force — another jump in the viable scale of collective action. The scaling exponents are not constants of nature; they are artefacts of the coordination technology available. Change the technology, change \(\beta\), change the equilibrium size of everything.

This is not quite hard technological determinism; we are not claiming the technology uniquely determines outcomes. Alesina and Spolaore (2003) already argue that the equilibrium depends on the international security environment, trade openness, and democratic norms, not just coordination costs. But the scaling exponents set the envelope of what is possible, and the technology moves the envelope. Soft determinism, maybe. The tools constrain the menu; the polity still orders from it.

Then we are back to Wong and Bartlett’s question: a civilization with a sufficiently high \(\beta\) will face accelerating singularities. Is homeostatic awakening something that can be engineered? Or is it the kind of thing a civilization only stumbles into after a sufficiently frightening brush with burnout? And, from the multi-level agency angle, who does the awakening? The civilization, the nation, the firm? A single God-like uploaded CEO? Roko’s basilisk? If the scaling exponents differ across levels, the incentive to keep growing differs across levels too. The firm-level agents face sublinear returns and might happily plateau; the city-level dynamics keep accelerating regardless of what any individual agent wants. The homeostatic awakening, if it happens, has to happen at the level where \(\beta > 1\), which may not be the level at which anyone has decision-making authority.

TODO: dig into Wolpert and Harper (2025) more carefully — their “society as computer” model might give us a way to formalise when coordination technology changes the scaling exponent versus when it merely moves us along the existing curve.

TODO: the biological scaling literature deserves its own subsection. Kleiber’s law (\(\beta = 3/4\) for metabolic rate) and the West-Brown-Enquist model are the template that Wong and Bartlett are generalising. I have not done this justice.

TODO: the Anderson Imagined Communities connection is worth chasing — there’s a history-of-coordination-technology story here that connects the printing press → nation state → telegraph → empire → internet → platform monopoly → AI → ??? sequence. Each technology changes the coordination exponent for a different level of organisation.

6 References

Alesina, and Spolaore. 2003. The Size of Nations.
Arcaute, Hatna, Ferguson, et al. 2015. “Constructing Cities, Deconstructing Scaling Laws.” Journal of the Royal Society Interface.
Bettencourt. 2013. “The Origins of Scaling in Cities.” Science.
Bettencourt, and Lobo. 2016. “Urban Scaling in Europe.” Journal of the Royal Society Interface.
Bettencourt, Lobo, Helbing, et al. 2007. “Growth, Innovation, Scaling, and the Pace of Life in Cities.” Proceedings of the National Academy of Sciences.
Bettencourt, Lobo, Strumsky, et al. 2010. “Urban Scaling and Its Deviations: Revealing the Structure of Wealth, Innovation and Crime Across Cities.” PLOS ONE.
Brill. 2024. “Neural Scaling Laws Rooted in the Data Distribution.”
Coase. 1937. “The Nature of the Firm.” Economica.
Cottineau, Hatna, Arcaute, et al. 2017. “Diverse Cities or the Systematic Paradox of Urban Scaling Laws.” Computers, Environment and Urban Systems, Spatial analysis with census data: emerging issues and innovative approaches.
Douglas, and Verstyuk. 2025. “Progress in Artificial Intelligence and Its Determinants.”
Grace. 2013. “Algorithmic Progress in Six Domains.”
Harré, and Ormerod. 2025. “The Coasean Reversal: AI and the Boundaries of the Firm — a Theoretical Framework Synthesising Marshall, Coase, Aoki, and Kauffman.”
Hoffmann, Borgeaud, Mensch, et al. 2022. “Training Compute-Optimal Large Language Models.”
Hooker. 2020. “The Hardware Lottery.” arXiv:2009.06489 [Cs].
Kaplan, McCandlish, Henighan, et al. 2020. “Scaling Laws for Neural Language Models.” arXiv:2001.08361 [Cs, Stat].
Korinek, and Vipra. 2025. “Concentrating Intelligence: Scaling and Market Structure in Artificial Intelligence.” Economic Policy.
Kühnert, Helbing, and West. 2006. “Scaling Laws in Urban Supply Networks.” Physica A: Statistical Mechanics and Its Applications, Information and Material Flows in Complex Networks.
Prieto Curiel, Cabrera-Arnau, and Bishop. 2022. “Scaling Beyond Cities.” Frontiers in Physics.
Rybski, Arcaute, and Batty. 2019. “Urban Scaling Laws.” Environment and Planning B: Urban Analytics and City Science.
Scharf. 2021. The Ascent of Information: Books, Bits, Genes, Machines, and Life’s Unending Algorithm.
West. 2017. Scale: The Universal Laws of Growth, Innovation, Sustainability, and the Pace of Life in Organisms, Cities, Economies, and Companies.
Wolpert, and Harper. 2025. “The Computational Power of a Human Society: A New Model of Social Evolution.”
Wong, and Bartlett. 2022. “Asymptotic Burnout and Homeostatic Awakening: A Possible Solution to the Fermi Paradox?” Journal of The Royal Society Interface.
Xexéo, Braida, Parreiras, et al. 2024. “The Economic Implications of Large Language Model Selection on Earnings and Return on Investment: A Decision Theoretic Model.”

Footnotes

  1. The “dataome” is the accumulated information layer outside biological organisms: books, architecture, computers, the internet. The term is a coinage from Scharf (2021).