Material basis of AI
Energy, chips, water
2023-03-23 — 2025-10-07
Wherein the economics of foundation models is examined and the disproportionate energy and water demands of large-scale training, including data‑centre cooling and emissions accounting, are described.
It’s complicated, important, political, and extremely interesting. TODO.
1 Incoming
- From Hugging Face:

  > The growing energy demand stems from both the increasing adoption of energy-intensive AI models, particularly large language models (LLMs), and the widespread adoption of AI in user-facing applications. However, despite these concerns, there is no clear consensus on what constitutes “AI” and how to comprehensively account for its direct and indirect environmental effects. […]
  >
  > This project seeks to address this gap by establishing a standardized framework for reporting AI models’ energy efficiency, thereby enhancing transparency across the field.
- The Electric Slide, by Packy McCormick and Sam D’Amico:

  > America is, implicitly or explicitly, making a bet that whoever wins intelligence, in the form of AI, wins the future.
  >
  > China is making a different bet: that for intelligence to truly matter, it needs energy and action.
Oviedo et al. (2025): Most prior estimates extrapolate from lab or small-batch benchmarks. Oviedo et al. show—quantitatively—that this leads to a 4–20× overestimation of per-query energy. They explicitly separate model-level, serving-level, and hardware-level contributions, and empirically characterize model utilization and PUE distributions.
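The flavour of that bottom-up decomposition, as a minimal sketch with invented numbers (only the structure, not the values, comes from the paper): a lab benchmark at batch size 1 on half-idle hardware attributes far more energy to each query than a batched, highly utilized production deployment does.

```python
# Illustrative bottom-up decomposition of per-query inference energy.
# Every number here is invented for illustration; Oviedo et al. (2025)
# fit the real distributions empirically.

def energy_per_query_wh(gpu_power_w, latency_s, batch_size, utilization, pue):
    """Energy attributed to one query, in watt-hours."""
    gpu_energy_wh = gpu_power_w * latency_s / 3600.0   # board energy per batch
    return gpu_energy_wh / batch_size / utilization * pue

# "Lab" conditions: batch size 1, half-idle hardware, mediocre PUE.
lab = energy_per_query_wh(700, 2.0, batch_size=1, utilization=0.6, pue=1.4)

# Production serving: large batches, high utilization, efficient facility.
prod = energy_per_query_wh(700, 4.0, batch_size=16, utilization=0.85, pue=1.12)

print(f"lab ≈ {lab:.2f} Wh/query; production ≈ {prod:.3f} Wh/query")
print(f"overestimation factor ≈ {lab / prod:.0f}×")   # ≈ 14×, inside the 4–20× range
```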
Li et al. (2025): Introduces one of the first system-level design frameworks for making inference carbon-aware, incorporating dynamic routing and carbon-intensity-aware scheduling. It sets a methodological precedent for integrating emissions directly into inference system design.
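A toy version of carbon-intensity-aware routing, assuming a live per-region intensity feed; the region names, numbers, and latency-budget rule are all hypothetical stand-ins for the richer scheduling in the paper.

```python
# Toy carbon-aware router: send each request to the cleanest region that
# still meets a latency budget. Regions and numbers are hypothetical.

from dataclasses import dataclass

@dataclass
class Region:
    name: str
    gco2_per_kwh: float   # current grid carbon intensity
    latency_ms: float     # network latency from the client

def route(regions, latency_budget_ms):
    feasible = [r for r in regions if r.latency_ms <= latency_budget_ms]
    if not feasible:   # nothing meets the budget: fall back to fastest
        return min(regions, key=lambda r: r.latency_ms)
    return min(feasible, key=lambda r: r.gco2_per_kwh)

regions = [
    Region("hydro-north", gco2_per_kwh=30, latency_ms=120),
    Region("coal-east", gco2_per_kwh=700, latency_ms=40),
    Region("solar-west", gco2_per_kwh=90, latency_ms=80),
]
print(route(regions, latency_budget_ms=100).name)   # -> solar-west
```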
Elsworth et al. (2025): The first production-scale disclosure of energy intensity for Google’s AI workloads. It provides a baseline for realistic inference efficiency, in contrast with estimates from academic benchmarks, which tend to run higher.
Gupta et al. (2020): A classic conceptual framing of where energy and carbon footprints arise in computing stacks. Key precursor for the “bottom-up decomposition” approach used in Oviedo et al. (2025).
Chung et al. (2025): A standardized methodology for measuring inference energy under realistic serving conditions. It empirically validates Oviedo et al. (2025)’s claim that non-production benchmarks overstate energy by 4–20×.
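What “under realistic serving conditions” means in practice is roughly: sample GPU power while real traffic is served, integrate over time, and divide by completed requests. A minimal sketch using the NVML bindings (`pip install nvidia-ml-py`); this is my gloss rather than the paper’s harness, and `serve_batch` is a placeholder for an actual workload.

```python
# Measure GPU energy while a workload runs: poll NVML board power in the
# background, integrate over time, divide by requests served.
# Requires an NVIDIA GPU and the nvidia-ml-py (pynvml) package.

import threading, time
import pynvml

def measure_energy_wh(workload):
    """Run workload() while integrating GPU-0 power; return (Wh, result)."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    box = {}
    t = threading.Thread(target=lambda: box.setdefault("out", workload()))
    joules, last = 0.0, time.monotonic()
    t.start()
    while t.is_alive():
        time.sleep(0.1)                                        # ~10 Hz sampling
        now = time.monotonic()
        watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000  # mW -> W
        joules += watts * (now - last)                         # rectangle rule
        last = now
    t.join()
    pynvml.nvmlShutdown()
    return joules / 3600.0, box["out"]

# wh, responses = measure_energy_wh(lambda: serve_batch(requests))
# print(wh / len(responses), "Wh per request")
```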
Samsi et al. (2023): One of the earliest direct measurements of LLM inference power draw. Provides raw empirical data that Oviedo et al. reinterpret to argue why scaling to production reduces per-query energy dramatically.
Patel et al. (2024): Identifies dynamic voltage/frequency scaling and oversubscription as substantial efficiency levers (20–30% gains). It’s foundational for Oviedo et al. (2025)’s datacenter-level interventions.
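The reasoning behind DVFS as a lever, as a worked toy calculation: dynamic power scales roughly with V²·f, and since voltage can be lowered along with frequency, power falls roughly cubically with clock speed while throughput falls only linearly, so energy per operation drops quadratically. Real chips have static power and voltage floors, which is why realized gains sit nearer the paper’s 20–30% than this idealized curve.

```python
# Idealized DVFS arithmetic: dynamic power ∝ V² · f, with V lowered
# in proportion to f, so power ∝ f³ while throughput ∝ f.
# Ignores static power, hence overstates real-world savings.

for f in (1.0, 0.9, 0.8):        # relative clock frequency
    power = f ** 3               # ∝ V² · f with V ∝ f
    energy_per_op = power / f    # power / throughput ∝ f²
    print(f"clock {f:.0%}: power {power:.0%}, energy/op {energy_per_op:.0%}")

# clock 100%: power 100%, energy/op 100%
# clock 90%:  power 73%,  energy/op 81%   <- ~19% less energy for a 10% slowdown
# clock 80%:  power 51%,  energy/op 64%
```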
Yang et al. (2025): Demonstrates practical serving-side gains (1.7× throughput improvement) for long-sequence inference. It’s one of the few works I could find quantifying test-time scaling energy mitigation directly.
Stojkovic et al. (2025): Provides cluster-level modeling of inference throughput vs. energy efficiency. It forms the basis for Oviedo et al.’s proposition that deployment-scale optimization halves energy per query.
Luccioni, Viguier, and Ligozat (2023): The canonical early estimate of the inference footprint, though probably an overestimate because it relies on non-production benchmarks; Oviedo et al. (2025) respond directly to it.
Kamiya and Coroamă (2025): Provides the empirical historical context that forecasts routinely overestimate digital energy demand by ignoring compounding efficiency trends.
- Artificial intelligence technology behind ChatGPT was built in Iowa — with a lot of water | AP News
- Can the climate survive the insatiable energy demands of the AI arms race?
- James O’Donnell and Casey Crownhart: We did the math on AI’s energy footprint. Here’s the story you haven’t heard. | MIT Technology Review
  - Methodological supplement: Everything you need to know about estimating AI’s energy and emissions burden | MIT Technology Review
- kylemcdonald/nvidia-co2: Adds gCO2eq emissions to nvidia-smi.
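In the same spirit as that last repo, a minimal sketch of the conversion (not the repo’s actual implementation): read board power from `nvidia-smi` and multiply by a grid carbon-intensity figure, which here is a placeholder to be replaced with your own grid’s number.

```python
# Turn instantaneous GPU power draw into a CO2-equivalent emission rate.
# Placeholder carbon intensity; substitute your own grid's figure.

import subprocess

GRID_GCO2_PER_KWH = 400.0   # placeholder; varies hugely by grid and hour

def gpu_power_watts():
    """Per-GPU board power, via a standard nvidia-smi query."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [float(line) for line in out.splitlines() if line.strip()]

for i, watts in enumerate(gpu_power_watts()):
    rate = watts / 1000 * GRID_GCO2_PER_KWH   # kW × gCO2eq/kWh = gCO2eq/h
    print(f"GPU {i}: {watts:.0f} W ≈ {rate:.0f} gCO2eq/h")
```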
