How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference

2025 · arXiv 2505.09598

27 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer

cs.AI · 2026-06-11 · unverdicted · novelty 7.0

Otters++ realizes TTFS via measured device decay in optical synapses, uses hybrid QNN-equivalent training with noise awareness, and reports 84.17% average GLUE score with energy gains over prior spiking transformers.

Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems

cs.AI · 2026-05-20 · unverdicted · novelty 7.0

Proposes EpG and OOI metrics showing agentic workflows use 4.33x more energy per successful goal than linear baselines due to orchestration structure.

Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU

cs.DC · 2026-05-20 · conditional · novelty 7.0

LlamaWeb is a WebGPU backend for llama.cpp that uses static memory planning, tunable kernels, and templated multi-precision support to cut memory use by 29-33% and raise decode throughput by 45-69% versus prior browser frameworks on tested hardware.

LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

LLMSpace is the first framework to jointly model operational and embodied carbon for LLM inference on LEO satellites, incorporating radiation-hardened hardware, peripheral systems, and workload patterns such as prefill-decode behavior.

Quantifying the Agreement Between Data-Influence and Data-Similarity to Understand LLM Behavior

cs.LG · 2026-06-22 · unverdicted · novelty 6.0

Data-similarity and data-influence produce significantly overlapping rankings of training documents for LLM outputs, with asymmetry allowing a favorable cost-accuracy trade-off.

MÖVE: A Holistic LLM Benchmark for the German Public Sector

cs.CL · 2026-06-11 · unverdicted · novelty 6.0

MÖVE presents a new German-language benchmark evaluating 39 LLMs on performance and governance criteria using ten public-administration datasets.

EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving

cs.CV · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

EnergyLens derives a twelve-parameter closed-form energy model via symbolic regression that achieves 88.2% top-1 configuration accuracy with 50 samples and extrapolates to unseen batch sizes and hardware.

Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures

cs.DC · 2026-04-10 · unverdicted · novelty 6.0

Watt Counts supplies over 5,000 energy measurements across 50 LLMs and 10 GPUs and shows that hardware-aware selection can reduce server-scenario energy use by up to 70 percent with little effect on user experience.

SweetSpot: An Analytical Model for Predicting Energy Efficiency of LLM Inference

cs.AI · 2026-02-05 · unverdicted · novelty 6.0

SweetSpot is an analytical model from Transformer computational and memory complexity that identifies energy minima at short-to-moderate inputs and medium outputs, achieving 1.79% MAPE on H100 GPU measurements across multiple LLMs.

EcoAssist: Embedding Sustainability into AI-Assisted Frontend Development

cs.HC · 2026-04-06 · unverdicted · novelty 5.0

EcoAssist embeds energy estimation and optimization into AI-assisted frontend coding, reducing website energy use by 13-16% in benchmarks while preserving developer productivity.

Toward a Sustainable Software Architecture Community: Evaluating ICSA's Environmental Impact

cs.SE · 2026-04-05 · unverdicted · novelty 5.0

The study provides exploratory estimates of carbon emissions from GenAI inference in ICSA papers and from the full operations of the ICSA 2025 conference.

Walking the Tightrope of LLMs for Software Development: A Practitioners' Perspective

cs.SE · 2025-11-09 · conditional · novelty 5.0

Qualitative interview study with 22 practitioners identifies multi-level benefits, challenges, and mitigation strategies for using LLMs in software development.

As We May Search

cs.IR · 2026-06-28 · unverdicted · novelty 4.0

Proposes a local-first IR framework, with experiments showing that dense retrieval maintains over 91% nDCG@10 up to 100K documents on consumer hardware and that 7B local models reach within 4 points of cloud baselines.

Closing the Social-Semantic Gap: SPSD for Edge-Based Prompt Compression in Cloud LLM Inference

cs.LG · 2026-06-10 · unverdicted · novelty 4.0

SPSD uses a 4-bit SLM on the edge to distill prompts, saving a mean of 99.9 tokens per call with non-inferior response quality, as rated by an LLM judge, on a 248-prompt corpus.

Overview over the first decade of LIMITS

cs.CY · 2026-05-28 · unverdicted · novelty 4.0

An overview of the first decade of the LIMITS workshop identifies increasing mentions of degrowth, limited global expansion beyond the Global North, and a predominance of positional and observational papers.

Greening AI Inference with Accuracy and Latency-aware User Incentives

cs.LG · 2026-05-26 · unverdicted · novelty 4.0

A framework for two-tier AI service subscriptions that offer discounts for accepting lower quality or higher latency inference to reduce carbon emissions during high-intensity periods.

Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science

stat.OT · 2026-05-19 · conditional · novelty 4.0

Multiverse analysis of three published CSS studies reveals substantial variation in findings across methodological decision combinations and identifies cases of computational failure not reported in originals.

From Cradle to Cloud: A Life Cycle Review of AI's Environmental Footprint

cs.CY · 2026-05-06 · unverdicted · novelty 4.0

A review of AI sustainability studies finds inconsistent life cycle definitions and predominant reliance on coarse CO2e proxies, with limited coverage of water, materials, and multi-impact assessments.

Energy-Aware Routing to Large Reasoning Models

cs.AI · 2025-12-23 · unverdicted · novelty 4.0

In the critical regime for energy provisioning to large reasoning models, performance is volatility-limited, motivating variance-aware routing policies based on training and inference compute scaling laws.

The Many-Body Problem of the Data Centre

cs.AI · 2026-06-29 · unverdicted · novelty 3.0

Data centers embody AI as part of capital's body, creating a many-body problem from their non-unique form and enabling market-based comparison of intelligence value across organism-mechanism boundaries.

AI Data Centers and the Water Use Feedback Loop

cs.CY · 2026-06-19 · unverdicted · novelty 3.0

The paper formalizes the Water and AI Feedback Loop, introduces the Water Consumption Impact index, and shows water burden from AI data centers varies from 0.2% to 134% of local capacity across ten US sites.

Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey

cs.LG · 2026-06-09 · unverdicted · novelty 3.0

A survey that frames data selection, memory optimization, and compute budgeting as coupled bottlenecks in LLM training rather than isolated techniques.

Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities

cs.LG · 2026-05-05 · unverdicted · novelty 3.0 · 2 refs

Model collapse threatens AI democratization by disproportionately impacting low-resource and marginalized communities through reduced training efficiency and data distributions skewed away from distribution tails.

Scrapyard AI

cs.CY · 2026-04-09 · unverdicted · novelty 3.0

Obsolete AI models left behind by rapid development can be repurposed like scrap materials to analyze and communicate the environmental and social effects of global mining.

citing papers explorer

Showing 27 of 27 citing papers.

Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer cs.AI · 2026-06-11 · unverdicted · none · ref 2
Otters++ realizes TTFS via measured device decay in optical synapses, uses hybrid QNN-equivalent training with noise awareness, and reports 84.17% average GLUE score with energy gains over prior spiking transformers.
Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems cs.AI · 2026-05-20 · unverdicted · none · ref 20
Proposes EpG and OOI metrics showing agentic workflows use 4.33x more energy per successful goal than linear baselines due to orchestration structure.
Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU cs.DC · 2026-05-20 · conditional · none · ref 31
LlamaWeb is a WebGPU backend for llama.cpp that uses static memory planning, tunable kernels, and templated multi-precision support to cut memory use by 29-33% and raise decode throughput by 45-69% versus prior browser frameworks on tested hardware.
LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites cs.LG · 2026-05-07 · unverdicted · none · ref 1 · 2 links
LLMSpace is the first framework to jointly model operational and embodied carbon for LLM inference on LEO satellites, incorporating radiation-hardened hardware, peripheral systems, and workload patterns such as prefill-decode behavior.
Quantifying the Agreement Between Data-Influence and Data-Similarity to Understand LLM Behavior cs.LG · 2026-06-22 · unverdicted · none · ref 89
Data-similarity and data-influence produce significantly overlapping rankings of training documents for LLM outputs, with asymmetry allowing a favorable cost-accuracy trade-off.
MÖVE: A Holistic LLM Benchmark for the German Public Sector cs.CL · 2026-06-11 · unverdicted · none · ref 117
MÖVE presents a new German-language benchmark evaluating 39 LLMs on performance and governance criteria using ten public-administration datasets.
EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving cs.CV · 2026-05-11 · unverdicted · none · ref 12 · 2 links
EnergyLens derives a twelve-parameter closed-form energy model via symbolic regression that achieves 88.2% top-1 configuration accuracy with 50 samples and extrapolates to unseen batch sizes and hardware.
Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures cs.DC · 2026-04-10 · unverdicted · none · ref 15
Watt Counts supplies over 5,000 energy measurements across 50 LLMs and 10 GPUs and shows that hardware-aware selection can reduce server-scenario energy use by up to 70 percent with little effect on user experience.
SweetSpot: An Analytical Model for Predicting Energy Efficiency of LLM Inference cs.AI · 2026-02-05 · unverdicted · none · ref 11
SweetSpot is an analytical model from Transformer computational and memory complexity that identifies energy minima at short-to-moderate inputs and medium outputs, achieving 1.79% MAPE on H100 GPU measurements across multiple LLMs.
EcoAssist: Embedding Sustainability into AI-Assisted Frontend Development cs.HC · 2026-04-06 · unverdicted · none · ref 51
EcoAssist embeds energy estimation and optimization into AI-assisted frontend coding, reducing website energy use by 13-16% in benchmarks while preserving developer productivity.
Toward a Sustainable Software Architecture Community: Evaluating ICSA's Environmental Impact cs.SE · 2026-04-05 · unverdicted · none · ref 3
The study provides exploratory estimates of carbon emissions from GenAI inference in ICSA papers and from the full operations of the ICSA 2025 conference.
Walking the Tightrope of LLMs for Software Development: A Practitioners' Perspective cs.SE · 2025-11-09 · conditional · none · ref 39
Qualitative interview study with 22 practitioners identifies multi-level benefits, challenges, and mitigation strategies for using LLMs in software development.
As We May Search cs.IR · 2026-06-28 · unverdicted · none · ref 22
Proposes a local-first IR framework, with experiments showing that dense retrieval maintains over 91% nDCG@10 up to 100K documents on consumer hardware and that 7B local models reach within 4 points of cloud baselines.
Closing the Social-Semantic Gap: SPSD for Edge-Based Prompt Compression in Cloud LLM Inference cs.LG · 2026-06-10 · unverdicted · none · ref 15
SPSD uses a 4-bit SLM on the edge to distill prompts, saving a mean of 99.9 tokens per call with non-inferior response quality, as rated by an LLM judge, on a 248-prompt corpus.
Overview over the first decade of LIMITS cs.CY · 2026-05-28 · unverdicted · none · ref 15
An overview of the first decade of the LIMITS workshop identifies increasing mentions of degrowth, limited global expansion beyond the Global North, and a predominance of positional and observational papers.
Greening AI Inference with Accuracy and Latency-aware User Incentives cs.LG · 2026-05-26 · unverdicted · none · ref 4
A framework for two-tier AI service subscriptions that offer discounts for accepting lower quality or higher latency inference to reduce carbon emissions during high-intensity periods.
Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science stat.OT · 2026-05-19 · conditional · none · ref 92
Multiverse analysis of three published CSS studies reveals substantial variation in findings across methodological decision combinations and identifies cases of computational failure not reported in originals.
From Cradle to Cloud: A Life Cycle Review of AI's Environmental Footprint cs.CY · 2026-05-06 · unverdicted · none · ref 48
A review of AI sustainability studies finds inconsistent life cycle definitions and predominant reliance on coarse CO2e proxies, with limited coverage of water, materials, and multi-impact assessments.
Energy-Aware Routing to Large Reasoning Models cs.AI · 2025-12-23 · unverdicted · none · ref 8
In the critical regime for energy provisioning to large reasoning models, performance is volatility-limited, motivating variance-aware routing policies based on training and inference compute scaling laws.
The Many-Body Problem of the Data Centre cs.AI · 2026-06-29 · unverdicted · none · ref 18
Data centers embody AI as part of capital's body, creating a many-body problem from their non-unique form and enabling market-based comparison of intelligence value across organism-mechanism boundaries.
AI Data Centers and the Water Use Feedback Loop cs.CY · 2026-06-19 · unverdicted · none · ref 8
The paper formalizes the Water and AI Feedback Loop, introduces the Water Consumption Impact index, and shows water burden from AI data centers varies from 0.2% to 134% of local capacity across ten US sites.
Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey cs.LG · 2026-06-09 · unverdicted · none · ref 5
A survey that frames data selection, memory optimization, and compute budgeting as coupled bottlenecks in LLM training rather than isolated techniques.
Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities cs.LG · 2026-05-05 · unverdicted · none · ref 12 · 2 links
Model collapse threatens AI democratization by disproportionately impacting low-resource and marginalized communities through reduced training efficiency and data distributions skewed away from distribution tails.
Scrapyard AI cs.CY · 2026-04-09 · unverdicted · none · ref 12
Obsolete AI models left behind by rapid development can be repurposed like scrap materials to analyze and communicate the environmental and social effects of global mining.
Why AI Slop Matters, but Not Like That cs.CY · 2026-06-10 · unverdicted · none · ref 15
A response paper arguing that research on AI slop requires contextual, culturally-grounded analysis of social function and aesthetic value rather than the approach taken in the critiqued work.
Cloud-native and Distributed Systems for Efficient and Scalable Large Language Models -- A Research Agenda cs.DC · 2026-04-19 · unverdicted · none · ref 60
This research agenda argues that cloud-native architectures, microservices, autoscaling, and emerging trends like serverless inference and federated learning are required to make large language models efficient and scalable.
Synthetic Reflections on Resource Extraction cs.CY · 2026-02-10 · unverdicted · none · ref 22
A bespoke Urban Dwelling and Mining Index is introduced to improve multimodal AI models' assessment of mining operations' spatial distribution from satellite imagery.
