How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference
26 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: support 1
citing papers explorer
-
Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer
Otters++ realizes TTFS via measured device decay in optical synapses, uses hybrid QNN-equivalent training with noise awareness, and reports 84.17% average GLUE score with energy gains over prior spiking transformers.
-
Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems
Proposes EpG and OOI metrics showing agentic workflows use 4.33x more energy per successful goal than linear baselines due to orchestration structure.
-
Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU
LlamaWeb is a WebGPU backend for llama.cpp that uses static memory planning, tunable kernels, and templated multi-precision support to cut memory use by 29-33% and raise decode throughput by 45-69% versus prior browser frameworks on tested hardware.
-
LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites
LLMSpace is the first framework to jointly model operational and embodied carbon for LLM inference on LEO satellites, incorporating radiation-hardened hardware, peripheral systems, and workload patterns such as prefill-decode behavior.
-
Quantifying the Agreement Between Data-Influence and Data-Similarity to Understand LLM Behavior
Data-similarity and data-influence produce significantly overlapping rankings of training documents for LLM outputs, with asymmetry allowing a favorable cost-accuracy trade-off.
-
MÖVE: A Holistic LLM Benchmark for the German Public Sector
MÖVE presents a new German-language benchmark evaluating 39 LLMs on performance and governance criteria using ten public-administration datasets.
-
EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving
EnergyLens derives a twelve-parameter closed-form energy model via symbolic regression that achieves 88.2% top-1 configuration accuracy with 50 samples and extrapolates to unseen batch sizes and hardware.
-
Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures
Watt Counts supplies over 5,000 energy measurements across 50 LLMs and 10 GPUs and shows that hardware-aware selection can reduce server-scenario energy use by up to 70 percent with little effect on user experience.
-
SweetSpot: An Analytical Model for Predicting Energy Efficiency of LLM Inference
SweetSpot is an analytical model derived from Transformer computational and memory complexity that identifies energy minima at short-to-moderate inputs and medium outputs, achieving 1.79% MAPE on H100 GPU measurements across multiple LLMs.
-
EcoAssist: Embedding Sustainability into AI-Assisted Frontend Development
EcoAssist embeds energy estimation and optimization into AI-assisted frontend coding, reducing website energy use by 13-16% in benchmarks while preserving developer productivity.
-
Toward a Sustainable Software Architecture Community: Evaluating ICSA's Environmental Impact
The study provides exploratory estimates of carbon emissions from GenAI inference in ICSA papers and from the full operations of the ICSA 2025 conference.
-
Walking the Tightrope of LLMs for Software Development: A Practitioners' Perspective
A qualitative interview study with 22 practitioners identifies multi-level benefits, challenges, and mitigation strategies for using LLMs in software development.
-
As We May Search
Proposes a local-first IR framework, with experiments showing that dense retrieval maintains over 91% nDCG@10 up to 100K documents on consumer hardware and that 7B local models reach within 4 points of cloud baselines.
-
Closing the Social-Semantic Gap: SPSD for Edge-Based Prompt Compression in Cloud LLM Inference
SPSD uses a 4-bit SLM on the edge to distill prompts, saving a mean of 99.9 tokens per call with non-inferior response quality, as rated by an LLM judge on a 248-prompt corpus.
-
Overview over the first decade of LIMITS
An overview of the first decade of the LIMITS workshop identifies increasing mentions of degrowth, limited global expansion beyond the Global North, and a predominance of positional and observational papers.
-
Greening AI Inference with Accuracy and Latency-aware User Incentives
A framework for two-tier AI service subscriptions that offer discounts for accepting lower quality or higher latency inference to reduce carbon emissions during high-intensity periods.
-
Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science
A multiverse analysis of three published CSS studies reveals substantial variation in findings across combinations of methodological decisions and identifies cases of computational failure not reported in the original studies.
-
From Cradle to Cloud: A Life Cycle Review of AI's Environmental Footprint
A review of AI sustainability studies finds inconsistent life cycle definitions and predominant reliance on coarse CO2e proxies, with limited coverage of water, materials, and multi-impact assessments.
-
Energy-Aware Routing to Large Reasoning Models
In the critical regime for energy provisioning to large reasoning models, performance is volatility-limited, motivating variance-aware routing policies based on training and inference compute scaling laws.
-
The Many-Body Problem of the Data Centre
Data centers embody AI as part of capital's body, creating a many-body problem from their non-unique form and enabling market-based comparison of intelligence value across organism-mechanism boundaries.
-
Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey
A survey that frames data selection, memory optimization, and compute budgeting as coupled bottlenecks in LLM training rather than isolated techniques.
-
Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities
Model collapse threatens AI democratization by disproportionately impacting low-resource and marginalized communities through reduced training efficiency and data distributions skewed away from distribution tails.
-
Scrapyard AI
Obsolete AI models left behind by rapid development can be repurposed like scrap materials to analyze and communicate the environmental and social effects of global mining.
-
Why AI Slop Matters, but Not Like That
A response paper arguing that research on AI slop requires contextual, culturally-grounded analysis of social function and aesthetic value rather than the approach taken in the critiqued work.
-
Cloud-native and Distributed Systems for Efficient and Scalable Large Language Models -- A Research Agenda
This research agenda argues that cloud-native architectures, microservices, autoscaling, and emerging trends like serverless inference and federated learning are required to make large language models efficient and scalable.
-
Synthetic Reflections on Resource Extraction
A bespoke Urban Dwelling and Mining Index is introduced to improve multimodal AI models' assessment of mining operations' spatial distribution from satellite imagery.