Material basis of AI economics
Energy, chips, water
2023-03-23 — 2025-10-07
Wherein the economics of foundation models is examined and the disproportionate energy and water demands of large-scale training, including data-centre cooling and emissions accounting, are described.
Complicated and political, and extremely interesting. TODO.
Incoming
- From Hugging Face: “This project seeks to address this gap by establishing a standardized framework for reporting AI models’ energy efficiency, thereby enhancing transparency across the field.”
The Electric Slide - by Packy McCormick and Sam D’Amico: “America is, implicitly or explicitly, making a bet that whoever wins intelligence, in the form of AI, wins the future. China is making a different bet: that for intelligence to truly matter, it needs energy and action.”
(Oviedo2025Energy?): Most prior estimates extrapolated from lab or small-batch benchmarks. Oviedo et al. show, quantitatively, that this leads to 4–20× overestimation of per-query energy. They explicitly separate model-level, serving-level, and hardware-level contributions, and empirically model utilization and PUE distributions.
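To make that bottom-up decomposition concrete, here is a toy sketch; every constant is invented for illustration and none comes from Oviedo et al. It shows how small batches, idle-heavy harnesses, and facility overheads pull lab and production figures apart.

```python
# Illustrative bottom-up estimate of energy per query, separating hardware
# power, serving-level batching/utilization, and facility overhead (PUE).
# All numbers are made up for the example, not taken from any paper.

def energy_per_query_j(gpu_power_w: float, latency_s: float, batch_size: int,
                       utilization: float, pue: float) -> float:
    gpu_energy_j = gpu_power_w * latency_s / batch_size  # board energy per query
    return gpu_energy_j / utilization * pue              # charge idle time + facility

# Lab-style benchmark: batch of 1, idle-heavy harness, no facility overhead.
lab = energy_per_query_j(gpu_power_w=700, latency_s=2.0, batch_size=1,
                         utilization=0.5, pue=1.0)
# Production-style serving: continuous batching, high utilization, real PUE.
prod = energy_per_query_j(gpu_power_w=700, latency_s=2.0, batch_size=8,
                          utilization=0.85, pue=1.2)
print(f"lab ≈ {lab:.0f} J/query, production ≈ {prod:.0f} J/query, "
      f"ratio ≈ {lab / prod:.0f}×")
# -> ratio around 11x: batching and utilization alone span the kind of
#    4-20x gap attributed to benchmark-vs-production mismatch.
```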
(Li2025EcoServe?): Introduces one of the first system-level design frameworks for making inference carbon-aware, incorporating dynamic routing and carbon-intensity-aware scheduling. Sets a methodological precedent for integrating emissions directly into inference system design.
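The core scheduling move can be sketched in a few lines: send each request to whichever serving region currently has the cleanest grid, subject to a latency budget. Region names, intensities, and the interface below are all invented for illustration; this is not EcoServe’s actual API.

```python
# Minimal carbon-intensity-aware router: pick the feasible region with the
# cleanest grid right now. All data below is fabricated for illustration.
from dataclasses import dataclass

@dataclass
class Region:
    name: str
    carbon_gco2_per_kwh: float  # current grid carbon intensity
    rtt_ms: float               # network round trip from the client

REGIONS = [
    Region("hydro-north", 30.0, 120.0),
    Region("coal-east", 650.0, 25.0),
    Region("solar-west", 90.0, 60.0),
]

def route(latency_budget_ms: float) -> Region:
    feasible = [r for r in REGIONS if r.rtt_ms <= latency_budget_ms]
    if not feasible:
        raise ValueError("no region meets the latency budget")
    return min(feasible, key=lambda r: r.carbon_gco2_per_kwh)

print(route(latency_budget_ms=80.0).name)   # -> solar-west (cleanest feasible)
print(route(latency_budget_ms=200.0).name)  # -> hydro-north
```

A real scheduler would also weigh queue depth and the emissions cost of moving data, but the objective stays the same: minimise gCO2eq per request, not just latency.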
(Elsworth2025Measuring?): Provides the first production-scale disclosure of energy intensity for Google’s AI workloads. Baseline for realistic inference efficiency, contrasting with overestimates from academic benchmarks.
(Gupta2020Chasing?): Classic conceptual framing of where energy and carbon footprints arise in computing stacks. Key precursor for the “bottom-up decomposition” approach used in (Oviedo2025Energy?).
(Chung2025MLENERGY?): Establishes standardized methodology for measuring inference energy under realistic serving conditions. Empirical validation for the (Oviedo2025Energy?) claim that non-production benchmarks overstate energy by 4–20×.
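For hands-on measurement of the same quantity: NVML exposes a cumulative on-board energy counter on Volta-class and newer GPUs, wrapped (as far as I know) by pynvml’s nvmlDeviceGetTotalEnergyConsumption. A crude per-call reading might look like the sketch below, with run_inference as a placeholder; the gap between this and a standardized benchmark is exactly the attribution and serving-condition problem ML.ENERGY tackles.

```python
# Rough energy measurement around an inference call via NVML's cumulative
# energy counter (millijoules since driver load; Volta GPUs or newer).
# run_inference() is a stand-in for whatever serving call you are profiling.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

def measure_joules(fn, *args, **kwargs):
    start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
    result = fn(*args, **kwargs)
    end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
    return result, (end_mj - start_mj) / 1000.0  # mJ -> J

# result, joules = measure_joules(run_inference, prompt="...")
# Caveat: this charges the whole board to one request, so it only makes
# sense under controlled single-stream conditions -- precisely the gap
# between ad-hoc measurement and a standardized serving benchmark.
pynvml.nvmlShutdown()
```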
(Samsi2023Words?): Among the earliest direct measurements of LLM inference power draw. Provides raw empirical data that Oviedo et al. reinterpret to argue why scaling to production reduces per-query energy dramatically.
(Patel2024Characterizing?): Identifies dynamic voltage/frequency scaling and oversubscription as substantial efficiency levers (20–30% gains). Foundational for the datacenter-level interventions considered in (Oviedo2025Energy?).
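A first-order view of why frequency scaling is such a lever: in the textbook CMOS approximation, dynamic power scales as V²f and voltage tracks frequency, so power falls roughly as f³ while throughput falls only as f. The sketch below works through that arithmetic; it is the classroom model, not the measured GPU behaviour Patel et al. characterize.

```python
# First-order DVFS tradeoff under the textbook CMOS approximation:
# power ~ f**3 (since P ~ V^2 * f and V scales ~ f), throughput ~ f,
# so energy per token falls as f**2 when you down-clock, at the cost of
# latency. Real GPUs deviate (static power, memory-bound phases), which
# is why measured characterizations are needed.
for cap in (1.0, 0.9, 0.8, 0.7):           # frequency as a fraction of max
    power = cap ** 3                        # relative dynamic power
    throughput = cap                        # relative tokens/s
    energy_per_token = power / throughput   # relative J/token, = cap**2
    print(f"f={cap:.1f}: power={power:.2f}, "
          f"tokens/s={throughput:.2f}, J/token={energy_per_token:.2f}")
```

Even in this crude model a 10–20% frequency cap yields 19–36% energy-per-token savings, the right order of magnitude for the 20–30% gains reported.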
(Yang2025LServe?): Demonstrates practical serving-side gains (1.7× throughput improvement) for long-sequence inference. One of few works quantifying test-time scaling energy mitigation directly.
(Stojkovic2025DynamoLLM?): Provides cluster-level modeling of inference throughput vs. energy efficiency. Forms the basis for Oviedo et al.’s assumption that deployment-scale optimization halves energy per query.
(Luccioni2023Estimating?): The canonical pre-(Oviedo2025Energy?) estimate of inference footprint. Oviedo et al. explicitly contrast this with production-level figures to demonstrate that such academic benchmarks vastly overstate real-world energy use.
(Kamiya2025Data?): Provides the empirical historical context that forecasts routinely overestimate digital energy demand by ignoring compounding efficiency trends.
Can the climate survive the insatiable energy demands of the AI arms race?
James O’Donnell and Casey Crownhart: We did the math on AI’s energy footprint. Here’s the story you haven’t heard. | MIT Technology Review
- Methodological supplement: Everything you need to know about estimating AI’s energy and emissions burden | MIT Technology Review
kylemcdonald/nvidia-co2: Adds gCO2eq emissions to nvidia-smi.
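The underlying conversion such a tool performs is small: sample board power and multiply by a grid carbon-intensity factor. A minimal sketch, with the intensity constant assumed for illustration (real figures vary by grid and by hour):

```python
# Convert instantaneous GPU power draw into an emissions rate, the basic
# conversion behind a tool like nvidia-co2. The carbon intensity below is
# an assumed placeholder, not a sourced figure.
import subprocess

GRID_GCO2_PER_KWH = 400.0  # assumed grid intensity, g CO2eq per kWh

out = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=power.draw",
     "--format=csv,noheader,nounits"],
    text=True,
)
for i, line in enumerate(out.strip().splitlines()):
    watts = float(line)
    gco2_per_hour = watts / 1000.0 * GRID_GCO2_PER_KWH
    print(f"GPU {i}: {watts:.0f} W ≈ {gco2_per_hour:.1f} gCO2eq/h")
```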