Economics of foundation models
Microeconomics of compute
March 23, 2023 — April 14, 2025
1 Returns to scale for large AI firms
DeepSeek is a Chinese company, not a community, but they seem to be changing the game in terms of cost, accessibility, and openness of AI models. TBD.
- DeepSeek on X: “🚀 DeepSeek-R1 is here! ⚡ Performance on par with OpenAI-o1 📖 Fully open-source model & commercialise freely! 🌐 Website & API are live now!”
- deepseek-ai (DeepSeek)
- Paper page - DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
The leaked Google document “We Have No Moat, And Neither Does OpenAI” asserts that large corporates are concerned that LLMs do not provide sufficient return on capital.
2 Spamularity, dark forest, textpocalypse
See Spamularity.
3 PR, hype, marketing
George Hosu, in a short aside, highlights the incredible marketing advantage of AI:
People that failed to lift a finger to integrate better-than-doctors or work-with-doctors supervised medical models for half a century are stoked at a chatbot being as good as an average doctor and can’t wait to get it to triage patients
Google’s Bard was undone on day two by an inaccurate response in the demo video, where it suggested that the James Webb Space Telescope would take the first images of exoplanets. This sounds like something the JWST would do, but it is not at all true.
So one tweet from an astrophysicist sank Alphabet’s value by 9% (roughly US$100 billion of market capitalisation). This says a lot about two things:
- LLMs are like being at the pub with friends: they can say things that sound plausible and true enough, and no one really needs to check, because who cares? Except we do, because this is science, not a lads’ night out.
- The speculative volatility of this AI bubble means the hype is so razor-thin it can be undermined by a tweet with 44 likes.
I wondered whether there is any exploration of the ‘thickness’ of hype. Jack Stilgoe suggested looking at Borup et al. (2006), which is evergreen, but I feel like there is something more to say about the resilience of hype:
Crypto hype, for example, was/is pretty thin in the scheme of things: high levels of hype, but frenetic, unstable, and quick to collapse.
AI hype is pretty consistent, if pulsating, gradually growing over the years, while something like nuclear fusion hype is super thick (at least in the popular imagination), persisting through decades of not-quite-ready and grasping at the slightest indication of success.
I don’t know; if there is nothing specifically on this, maybe I should write it one day.
4 Empirical frontier-model cash money costs
- DeepSeek V3 and the cost of frontier AI models
- DeepSeek FAQ – Stratechery by Ben Thompson
- Observations About LLM Inference Pricing | MIGRI TGT
- AI Model & API Providers Analysis | Artificial Analysis
- Data on the Trajectory of AI | Epoch AI Database | Epoch AI
- Algorithmic Progress in Language Models | Epoch AI
5 Democratisation of AI
6 Art and creativity
For now, see timeless works of art.
7 Data sovereignty
See data sovereignty.
8 AI tech soap opera
9 Incoming
I would prefer it as “AGI is less likely to abolish human labour value than you previously thought” rather than a blanket statement, but YMMV.
- Ilya Sutskever: “Sequence to sequence learning with neural networks: what a decade”
- Can the climate survive the insatiable energy demands of the AI arms race?
- Why Quora isn’t useful anymore: A.I. came for the best site on the internet.
- What Will Transformers Transform? – Rodney Brooks
- Gradient Dissent, a list of reasons that large backpropagation-trained networks might be worrisome. Some interesting points in there, and some hyperbole. Also: if it were true that externalities come from backprop networks (i.e. that they are a kind of methodological pollution that produces private benefits but public costs), then what kind of mechanisms should disincentivise them?
- C&C Against Predictive Optimisation
- Stanford CRFM
In this post, we evaluate whether major foundation model providers currently comply with these draft requirements and find that they largely do not. Foundation model providers rarely disclose adequate information regarding the data, compute, and deployment of their models as well as the key characteristics of the models themselves. In particular, foundation model providers generally do not comply with draft requirements to describe the use of copyrighted training data, the hardware used and emissions produced in training, and how they evaluate and test models. As a result, we recommend that policymakers prioritise transparency, informed by the AI Act’s requirements. Our assessment demonstrates that it is currently feasible for foundation model providers to comply with the AI Act, and that disclosure related to foundation models’ development, use, and performance would improve transparency in the entire ecosystem.
- I Do Not Think It Means What You Think It Means: Artificial Intelligence, Cognitive Work & Scale
- Groq Inference Tokenomics: Speed, But At What Cost?
- Invasive Diffusion: How one unwilling illustrator found herself turned into an AI model
- Cyborg Periods: There will be multiple AI transitions
- Bruce Schneier, On the Need for an AI Public Option
- Ecosystem Graphs for Foundation Models
- I Went to the Premiere of the First Commercially Streaming AI-Generated Movies
- Lower AI Costs Will Drive Innovation, Efficiency, and Adoption
- Spirals of Delusion: How AI Distorts Decision-Making and Makes Dictators More Dangerous (not convinced tbh)
- The Ghosts in the Machine, by Liz Pelly