BIRDS: Characterizing and Understanding Biodiversity Impact of Large Language Model Serving

Tianyao Shi; Yi Ding

arXiv: 2605.27480 · v2 · pith:YUAPJXLQnew · submitted 2026-05-26 · q-bio.OT · cs.AI · cs.CY


Pith reviewed 2026-06-29 14:35 UTC · model grok-4.3

classification: q-bio.OT · cs.AI · cs.CY

keywords: biodiversity impact · LLM serving · environmental footprint of AI · quality-normalized metrics · request-level functional units · operational and embodied impacts · AI sustainability · ecosystem damage


The pith

Biodiversity impact from large language model serving accumulates at scale and can be balanced against response quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents BIRDS as a framework to measure biodiversity effects of LLM serving at the level of individual requests. It separates operational impacts from running the models and embodied impacts from the hardware that supports them. A new metric called Quality-Normalized Biodiversity Impact combines the ecological cost with how good the model's output is. A reader would care because LLM use is expanding quickly and its effects on ecosystems go beyond carbon emissions to include direct damage to living systems. The work identifies concrete choices in how models are served that can lower this impact.

Core claim

BIRDS defines request-level functional units, quantifies operational and embodied biodiversity impact of LLM serving, and introduces Quality-Normalized Biodiversity Impact to jointly analyze ecological impact and response quality. Across diverse workloads, models, GPUs, and regions, biodiversity impact accumulates at scale and exposes actionable quality-aware serving tradeoffs.
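
The split between operational and embodied impact can be illustrated with a minimal sketch. The abstract does not give the underlying formulas, so everything here is an assumption made for illustration: the names `RequestProfile` and `bi_per_request`, the idea that operational impact scales with energy per request times a grid-dependent damage factor, and the idea that embodied impact is the hardware's life-cycle impact amortized by the share of its lifetime a request occupies. The numbers are placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestProfile:
    energy_kwh: float   # energy drawn to serve one request (hypothetical input)
    gpu_seconds: float  # GPU time the request occupies (hypothetical input)


def bi_per_request(profile, grid_factor, embodied_total, lifetime_seconds):
    """Return (operational, embodied, total) impact for one request.

    Assumed linear model, not taken from the paper:
      operational = energy per request * damage factor of the local grid
      embodied    = hardware life-cycle impact * share of lifetime used
    """
    if lifetime_seconds <= 0:
        raise ValueError("lifetime_seconds must be positive")
    operational = profile.energy_kwh * grid_factor
    embodied = embodied_total * (profile.gpu_seconds / lifetime_seconds)
    return operational, embodied, operational + embodied


# Placeholder numbers chosen for easy arithmetic, not measured values.
example = bi_per_request(
    RequestProfile(energy_kwh=2.0, gpu_seconds=10.0),
    grid_factor=0.5,
    embodied_total=100.0,
    lifetime_seconds=1000.0,
)
```

Under this assumed model, region enters only through `grid_factor` and GPU choice only through `embodied_total`, `lifetime_seconds`, and the per-request profile, which is one plausible reading of why the paper varies workloads, models, GPUs, and regions separately.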

What carries the argument

The BIRDS framework, which uses request-level functional units to track operational and embodied biodiversity pathways, together with the Quality-Normalized Biodiversity Impact metric that normalizes those impacts by response quality.
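
To make the normalization concrete, here is a hedged sketch of one way such a metric could be computed. The exact QNBI formula is not stated in the abstract; the form below (per-request impact divided by a quality score in (0, 1]) is an assumption, as are the function name `qnbi` and the sample values. The unit species·yr follows the axis labels in the paper's figures.

```python
def qnbi(bi_fu: float, quality: float) -> float:
    """Quality-normalized impact under an ASSUMED form: BI_fu / Q.

    bi_fu   -- biodiversity impact per functional unit, in species·yr
    quality -- quality score Q, assumed here to lie in (0, 1]
    """
    if not 0.0 < quality <= 1.0:
        raise ValueError("quality must be in (0, 1]")
    if bi_fu < 0.0:
        raise ValueError("bi_fu must be non-negative")
    return bi_fu / quality


# Two hypothetical models: the smaller one costs less per request,
# but its lower quality score makes it worse after normalization.
small = qnbi(bi_fu=1e-14, quality=0.2)  # about 5e-14 species·yr
large = qnbi(bi_fu=2e-14, quality=0.5)  # about 4e-14 species·yr
```

With this form, a lower raw footprint does not guarantee a lower normalized footprint, which is the kind of tradeoff the metric is meant to surface.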

If this is right

Operators can select serving configurations that lower biodiversity cost for a given level of output quality.
Different combinations of models and hardware produce measurably different biodiversity footprints that can be ranked.
Per-request impacts, though small, sum to large ecosystem effects when request volume grows.
Quality-aware decisions allow tradeoffs that reduce ecological load without sacrificing usable answers.
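
The implications above can be sketched as a small selection routine: choose the lowest-impact configuration that still meets a quality floor, then scale the per-request figure by request volume. The names `Config`, `cheapest_meeting_quality`, and `total_impact`, and all numbers, are hypothetical and only illustrate the reasoning; they are not the paper's procedure or data.

```python
from typing import NamedTuple, Optional, Sequence


class Config(NamedTuple):
    name: str
    bi_fu: float    # impact per request in species·yr (placeholder values)
    quality: float  # quality score, higher is better


def cheapest_meeting_quality(
    configs: Sequence[Config], min_quality: float
) -> Optional[Config]:
    """Lowest-impact config whose quality is at least min_quality, else None."""
    eligible = [c for c in configs if c.quality >= min_quality]
    return min(eligible, key=lambda c: c.bi_fu, default=None)


def total_impact(bi_fu: float, requests: int) -> float:
    """Per-request impact summed over a request volume (simple linear scaling)."""
    return bi_fu * requests


candidates = [
    Config("small", 1e-14, 0.20),
    Config("mid", 3e-14, 0.45),
    Config("large", 9e-14, 0.50),
]
choice = cheapest_meeting_quality(candidates, min_quality=0.4)
yearly = total_impact(choice.bi_fu, requests=10**9)
```

Linear scaling is the simplest assumption; it ignores load-dependent effects such as batching efficiency, which the paper's traffic-load figure suggests can matter.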

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Data-center siting decisions could incorporate biodiversity maps to avoid high-impact regions.
The same request-level accounting might apply to other large-scale computing workloads beyond language models.
Platforms could expose the QNBI value to users or regulators as an additional performance signal.
Long-term monitoring of real ecosystems near serving facilities would test whether the modeled pathways match observed outcomes.

Load-bearing premise

Biodiversity impact can be accurately quantified at the request level using defined functional units for operational and embodied pathways without major unaccounted measurement errors or biases.

What would settle it

An independent measurement campaign at LLM data centers that records actual changes in local species populations or habitat quality and compares the totals to the summed per-request estimates produced by BIRDS.

Figures

Figures reproduced from arXiv: 2605.27480 by Tianyao Shi, Yi Ding.

**Figure 2.** Overview of the three-step modeling procedure of BIRDS.

**Figure 3.** Distribution of FU- and token-level BIs (BI…

**Figure 5.** BIfu and quality Q(θ) for everyday chat workload serving. Bars show BIfu for the most energy-efficient serving configuration of each model, and the black line shows quality score. Models are grouped by family and ordered by size. For Qwen3 models with Instruct / Thinking variants, we report the Instruct models’ results here.

**Figure 6.** QNBI for daily chat workload serving. Each…

**Figure 8.** Effect of reasoning mode on MMLU-Pro serving for Qwen3 Instruct and Thinking variants. Left: BIfu and quality score Q(θ). Middle: response length distribution. Right: QNBI.

**Figure 9.** Traffic-load effect on QNBI and latency for…

**Figure 11.** BIfu and quality score for CrossCodeEval. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, Qwen-3, Gemma-4.

**Figure 12.** BIfu and quality score for RepoBench. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4.

**Figure 13.** BIfu and quality score for MMLU-Pro. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4.

**Figure 14.** BIfu and quality score for SuperGPQA. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4.

**Figure 15.** BIfu and quality score for LongBench long-output summarization.

**Figure 16.** BIfu and quality score for LongBench medium-output summarization. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4.

**Figure 17.** BIfu and quality score for LongBench medium-answer RAG QA. Axes: BIfu (species·yr) and Q(θ) per model; families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4.

**Figure 18.** BIfu and quality score for LongBench short-answer document QA.

**Figure 23.** QNBI for LongBench long-output summarization. Axes: model size (B params) vs. QNBI(θ) (species·yr); families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4; dense and MoE models marked separately.

**Figure 24.** QNBI for LongBench medium-output summarization. Axes: model size (B params) vs. QNBI(θ) (species·yr); families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4; dense and MoE models marked separately.

**Figure 25.** QNBI for LongBench medium-answer RAG QA. Axes: model size (B params) vs. QNBI(θ) (species·yr); families: Llama-3.1&3.2, GPT-OSS, Qwen-3, Gemma-4; dense and MoE models marked separately.

**Figure 26.** QNBI for LongBench short-answer document QA.

**Figure 27.** GPU-generation effect on QNBI for Gemma…

Original abstract

Large language model (LLM) serving creates environmental impacts beyond carbon and water, including ecosystem damage through biodiversity-related pathways. We present BIRDS, a framework for Biodiversity Impact of Request-Driven LLM Serving. BIRDS defines request-level functional units, quantifies operational and embodied biodiversity impact, and introduces Quality-Normalized Biodiversity Impact (QNBI) to jointly analyze ecological impact and response quality. Across diverse workloads, models, GPUs, and regions, BIRDS reveals that biodiversity impact accumulates at scale and exposes actionable quality-aware serving tradeoffs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

BIRDS introduces a request-level biodiversity metric for LLM serving that expands beyond carbon, but the abstract gives no methods or validation so the claims stay untestable.


The paper's main move is to treat biodiversity as a distinct impact category for LLM inference and to build a framework around request-level functional units plus a quality-normalized metric. That is new relative to the usual carbon or water footprints.

It does a clean job of separating operational and embodied pathways and of tying the metric to response quality so that serving decisions can be compared on both axes. The claim that impacts accumulate at scale and that quality-aware tradeoffs exist follows directly from the setup they describe.

The soft spot is that none of the quantification steps is shown. There is no description of the data sources for biodiversity damage factors, how the functional units are actually computed, what the error bars look like, or any cross-check against field measurements. Without those pieces the central numbers cannot be evaluated, and the weakest assumption, that request-level biodiversity impact can be measured accurately, remains unexamined.

The free parameters in the quality normalization step also need scrutiny to confirm they do not drive the results. The citation pattern looks light because the work is positioned as a first attempt at this particular framing.

This is for people already working on environmental metrics for AI systems who want to see the biodiversity angle added. A reader who needs reproducible methods or validated numbers will not get much yet.

I would send it to peer review. The topic is timely and the framing is coherent on its own terms; referees can check whether the actual calculations hold up once the methods are visible.

Referee Report

1 major / 0 minor

Summary. The paper presents the BIRDS framework for quantifying biodiversity impact of request-driven LLM serving. It defines request-level functional units, measures operational and embodied pathways, introduces the Quality-Normalized Biodiversity Impact (QNBI) metric to jointly consider impact and response quality, and applies the framework across workloads, models, GPUs, and regions to conclude that biodiversity impact accumulates at scale while exposing quality-aware serving tradeoffs.

Significance. If the request-level quantifications hold, the work is significant for extending LLM environmental assessments beyond carbon and water to biodiversity pathways. The QNBI metric is a novel contribution enabling joint analysis of ecological cost and quality, and the multi-setting evaluation provides concrete evidence of scale effects and tradeoffs that could guide sustainable serving policies.

major comments (1)

[QNBI definition] The quality normalization parameters are free parameters. Without a sensitivity analysis showing that the reported quality-aware tradeoffs remain stable under reasonable perturbations of these parameters, the central claim that BIRDS exposes actionable tradeoffs is not yet adequately supported.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the QNBI metric. We address the point below and will revise the manuscript accordingly.


Referee: [QNBI definition] The quality normalization parameters are free parameters. Without a sensitivity analysis showing that the reported quality-aware tradeoffs remain stable under reasonable perturbations of these parameters, the central claim that BIRDS exposes actionable tradeoffs is not yet adequately supported.

Authors: We agree that the quality normalization parameters in the QNBI definition are tunable and that explicit sensitivity analysis is required to substantiate the robustness of the quality-aware tradeoffs. In the revised manuscript we will add a dedicated sensitivity analysis subsection. This analysis will perturb the normalization parameters over ranges consistent with observed variability in the underlying quality metrics (e.g., factors from 0.5× to 2.0× the baseline values) and will demonstrate that the relative ordering of serving configurations and the identification of actionable tradeoffs remain stable. The new subsection will be placed after the main QNBI results and will directly support the central claims. revision: yes
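
A perturbation test of the kind the simulated rebuttal describes could look roughly like the sketch below. It is not the authors' analysis. It assumes a hypothetical normalization of the form BI / Q^alpha, with the exponent `alpha` standing in for the unspecified "normalization parameters", and checks whether the ranking of configurations is unchanged when `alpha` is scaled by 0.5×, 1.0×, and 2.0×. The function names and all values are placeholders.

```python
from typing import Dict, List, Sequence, Tuple

# name -> (impact per request in species·yr, quality score); placeholder data
Configs = Dict[str, Tuple[float, float]]


def ranking(configs: Configs, alpha: float) -> List[str]:
    """Config names ordered from lowest to highest BI / Q**alpha (assumed form)."""
    scores = {name: bi / (q ** alpha) for name, (bi, q) in configs.items()}
    return sorted(scores, key=scores.get)


def ranking_is_stable(
    configs: Configs,
    baseline_alpha: float = 1.0,
    factors: Sequence[float] = (0.5, 1.0, 2.0),
) -> bool:
    """True if every perturbed alpha yields the same ordering as the baseline."""
    base = ranking(configs, baseline_alpha)
    return all(ranking(configs, baseline_alpha * f) == base for f in factors)


# Equal quality scores: the ordering depends only on impact, so it is stable.
same_quality = {"a": (1e-14, 0.5), "b": (1e-13, 0.5)}

# Close impacts with different quality: the ordering flips as alpha changes.
close_call = {"a": (1e-14, 0.2), "b": (2e-14, 0.5)}

stable_case = ranking_is_stable(same_quality)
fragile_case = ranking_is_stable(close_call)
```

A check like this only reports whether rankings flip; it does not say which parameter value is right, so a fragile result would still need to be resolved by justifying the chosen normalization.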

Circularity Check

0 steps flagged

No significant circularity


The paper introduces BIRDS as a new framework that defines request-level functional units, quantifies operational and embodied biodiversity impacts via independent pathways, and defines QNBI for joint analysis. No equations, derivations, or self-citations are presented that reduce any claimed result to a fitted input or prior self-referential definition by construction. The quantification approach is presented as an external measurement method without internal circular reduction, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The framework rests on domain assumptions about biodiversity quantification and introduces a new normalized metric; no free parameters or invented entities beyond QNBI are detailed in the abstract.

free parameters (1)

Quality normalization parameters in QNBI
Parameters likely chosen or fitted to balance quality and impact across workloads, though not specified.

axioms (1)

Domain assumption: biodiversity impact can be quantified using request-level functional units for operational and embodied effects.
This underpins the entire BIRDS definition and QNBI calculation.

invented entities (1)

Quality-Normalized Biodiversity Impact (QNBI) · no independent evidence
Purpose: to jointly analyze ecological impact and response quality.
New metric introduced to combine impact and quality metrics.



Reference graph

Works this paper leans on

52 extracted references · 15 canonical work pages · 7 internal anchors


[1] [1]

online" 'onlinestring :=

ENTRY address archivePrefix author booktitle chapter edition editor eid eprint eprinttype howpublished institution journal key month note number organization pages publisher school series title type volume year doi pubmed url lastchecked label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block STRING...

[2] [2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

[3] [3]

Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, and Juanzi Li. 2023. https://arxiv.org/abs/2308.14508 Longbench: A bilingual, multitask benchmark for long context understanding . Preprint, arXiv:2308.14508

work page internal anchor Pith review Pith/arXiv arXiv 2023

[4] [4]

Arlene Blum. 2024. https://www.forbes.com/sites/arleneblum/2024/12/17/chip-manufacturing-shortcuts-harm-our-health-and-environment/ Chip manufacturing shortcuts harm our health and environment . Forbes

2024

[5] [5]

Andreas Busa, Malcolm Hegeman, Jeff Vickers, Natalia Duque-Ciceri, and Constantin Herrmann. 2019. https://www.delltechnologies.com/asset/en-us/products/servers/technical-support/Full_LCA_Dell_R740.pdf Life cycle assessment of dell r740 . Technical report, Dell Technologies. Accessed 19 May 2025

2019

[6] [6]

Bradley J Cardinale, J Emmett Duffy, Andrew Gonzalez, David U Hooper, Charles Perrings, Patrick Venail, Anita Narwani, Georgina M Mace, David Tilman, David A Wardle, and 1 others. 2012. Biodiversity loss and its impact on humanity. Nature, 486(7401):59--67

2012

[7] [7]

Aaron Chatterji, Thomas Cunningham, David J Deming, Zoe Hitzig, Christopher Ong, Carl Yan Shan, and Kevin Wadman. 2025. How people use chatgpt. Technical report, National Bureau of Economic Research

2025

[8] [8]

Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, and Bing Xiang. 2023. https://arxiv.org/pdf/2310.11248.pdf Crosscodeeval: A diverse and multilingual benchmark for cross-file code completion . In Thirty-seventh Conference on Neural Information Proces...

work page arXiv 2023

[9] [9]

Yi Ding and Tianyao Shi. 2024. Sustainable llm serving: Environmental implications, challenges, and opportunities. In 2024 IEEE 15th International Green and Sustainable Computing Conference (IGSC), pages 37--38. IEEE

2024

[10] [10]

European Commission Joint Research Centre (JRC) and Netherlands Environmental Assessment Agency . 2024. EDGAR\_2024\_GHG : Emissions database for global atmospheric research. https://edgar.jrc.ec.europa.eu/dataset_ghg2024. Accessed: 2025-05-19

2024

[11] [11]

Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Osi, Prateek Sharma, Fan Chen, and Lei Jiang. 2024. Llmcarbon: Modeling the end-to-end carbon footprint of large language models. In International Conference on Learning Representations, volume 2024, pages 24727--24741

2024

[12]

Sophia Falk, David Ekchajzer, Thibault Pirson, Etienne Lees-Perasso, Augustin Wattiez, Lisa Biber-Freudenberger, Sasha Luccioni, and Aimee van Wynsberghe. 2025. More than carbon: Cradle-to-grave environmental impacts of GenAI training on the NVIDIA A100 GPU. arXiv preprint arXiv:2509.00093

[13]

Matthias Finkbeiner, Atsushi Inaba, Reginald Tan, Kim Christiansen, and Hans-Jürgen Klüppel. 2006. The new international standards for life cycle assessment: ISO 14040 and ISO 14044. The International Journal of Life Cycle Assessment, 11:80–85. https://doi.org/10.1065/lca2006.02.002

[14]

Google DeepMind. 2026. Gemma 4 Model Card. https://ai.google.dev/gemma/docs/core/model_card_4. Accessed: 2026-05-14

[15]

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, and others. 2024. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783

[16]

GreenDelta. 2026. openLCA Nexus: ELCD database. GreenDelta GmbH. https://nexus.openlca.org/database/ELCD. Accessed: 2026-05-22

[17]

Pranjol Sen Gupta, Md Rajib Hossen, Pengfei Li, Shaolei Ren, and Mohammad A Islam. 2024. A dataset for research on water sustainability. In Proceedings of the 15th ACM International Conference on Future and Sustainable Energy Systems, pages 442–446

[18]

Udit Gupta, Mariam Elgamal, Gage Hills, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, and Carole-Jean Wu. 2022. ACT: Designing sustainable computer systems with an architectural carbon modeling tool. In ISCA. https://doi.org/10.1145/3470496.3527408

[19]

Mark A.J. Huijbregts, Zoran J.N. Steinmann, Pieter M.F. Elshout, Geert Stam, Francesca Verones, Marisa Vieira, Anne Hollander, Michiel Zijp, and Rosalie van Zelm. 2016. https://www.rivm.nl/publicaties/recipe-2016-harmonized-life-cycle-impact-assessment-method-at-midpoint-and-endpoint-level ReCiPe 2016: A Harmonized Life Cycle Impact Assessment Method at M...

[20]

IPBES. 2019. https://doi.org/10.5281/zenodo.3553579 Summary for policymakers of the global assessment report on biodiversity and ecosystem services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services. Technical report, IPBES secretariat, Bonn, Germany. S. Díaz, J. Settele, E. S. Brondízio, H. T. Ngo, M. Gu \`e...

[21]

Scotten W Jones. 2023. Modeling 300mm wafer fab carbon emissions. In 2023 International Electron Devices Meeting (IEDM), pages 1–4. IEEE. https://doi.org/10.1109/IEDM45741.2023.10413715

[22]

Walter Klöpffer and Birgit Grahl. 2014. Life cycle assessment (LCA): a guide to best practice. John Wiley & Sons

[23]

Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient memory management for large language model serving with PagedAttention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles

[24]

Baolin Li, Rohan Basu Roy, Daniel Wang, Siddharth Samsi, Vijay Gadepally, and Devesh Tiwari. 2023. https://doi.org/10.1145/3581784.3607035 Toward sustainable HPC: Carbon footprint estimation and environmental implications of HPC systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages...

[25]

Baolin Li, Yankai Jiang, Vijay Gadepally, and Devesh Tiwari. 2024. Sprout: Green generative AI with carbon-efficient LLM inference. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 21799–21813

[26]

Pengfei Li, Jianyi Yang, Mohammad A Islam, and Shaolei Ren. 2025. Making AI less 'thirsty'. Communications of the ACM, 68(7):54–61

[27]

Tianyang Liu, Canwen Xu, and Julian McAuley. 2024. RepoBench: Benchmarking repository-level code auto-completion systems. https://arxiv.org/abs/2306.03091

[28]

Gillian F Menzies, Seyhan Turan, and Philip FG Banfill. 2007. Life-cycle assessment and embodied energy: a review. Proceedings of the Institution of Civil Engineers-Construction Materials, 160(4):135–143. https://doi.org/10.1680/coma.2007.160.4.135

[29]

Microsoft. 2026. Microsoft datacenters: Efficiency. Microsoft Corporation. https://datacenters.microsoft.com/sustainability/efficiency/. Accessed: 2026-05-22

[30]

NVIDIA. 2020. NVIDIA A100 Tensor Core GPU Datasheet. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf. Accessed: 2026-05-10

[31]

NVIDIA. 2022a. NVIDIA H100 Tensor Core GPU. https://www.nvidia.com/en-us/data-center/h100/. Accessed: 2026-05-10

[32]

NVIDIA. 2022b. NVIDIA L40 GPU for Data Center. https://www.nvidia.com/en-us/data-center/l40/. Accessed: 2026-05-10

[33]

NVIDIA Corporation. 2025. NVIDIA Management Library (NVML). https://developer.nvidia.com/management-library-nvml

[34]

OpenAI. 2025. gpt-oss-120b & gpt-oss-20b model card. Preprint, arXiv:2508.10925. https://arxiv.org/abs/2508.10925

[35]

OpenRouter. 2026. OpenRouter rankings. OpenRouter. https://openrouter.ai/rankings. Accessed: 2026-05-22

[36]

Sundar Pichai. 2026. Google I/O 2026 keynote. Google The Keyword Blog. https://blog.google/intl/en-nz/company-news/sundar-pichai-io-2026/#momentum. Accessed: 2026-05-22

[37]

Qwen Team. 2025. Qwen3 technical report. Preprint, arXiv:2505.09388. https://arxiv.org/abs/2505.09388

[38]

Shaolei Ren, Bill Tomlinson, Rebecca W Black, and Andrew W Torrance. 2024. Reconciling the contrasting narratives on the environmental impact of large language models. Scientific Reports, 14(1):26310

[39]

ShareGPT. ShareGPT - share and save your conversations with AI. https://sharegpt.com/

[40]

Tianyao Shi, Ritbik Kumar, Inez Hua, and Yi Ding. 2025. When servers meet species: A fab-to-grave lens on computing's biodiversity impact. ACM SIGENERGY Energy Informatics Review, 5(2):34–40

[41]

SK hynix Inc. 2025. Sustainability reports & policies. https://www.skhynix.com/sustainability/UI-FR-SA1601/. Web page hosting SK hynix annual sustainability reports; accessed 19 May 2025

[42]

Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3645–3650

[43]

Taiwan Semiconductor Manufacturing Company Limited. 2025. ESG resources and documents. https://esg.tsmc.com/en/resources/documents.html. Web page, accessed 19 May 2025

[44]

M-A-P Team, Xinrun Du, Yifan Yao, Kaijing Ma, Bingli Wang, Tianyu Zheng, Kang Zhu, Minghao Liu, Yiming Liang, Xiaolong Jin, Zhenlin Wei, Chujie Zheng, Kaixing Deng, Shuyue Guo, Shian Jia, Sichao Jiang, Yiyan Liao, Rui Li, Qinrui Li, and 76 others. 2025. https://arxiv.org/abs/2502.14739 SuperGPQA: Scaling LLM evaluation across 285 graduate disciplines. Pr...

[45]

TechPowerUp. 2026. NVIDIA H100 SXM5 80 GB specs. TechPowerUp GPU Database. https://www.techpowerup.com/gpu-specs/h100-sxm5-80-gb.c3900. Accessed: 2026-05-17

[46]

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, and others. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288

[47]

UNEP. 2024. Environmental impacts of e-waste: Heavy metal leaching and ecosystem disruption. UNEP Reports on E-Waste. https://www.unep.org. E-waste releases heavy metals that contaminate soil and degrade freshwater and marine habitats

[48]

United States Environmental Protection Agency (EPA). 2025. Emissions & Generation Resource Integrated Database (eGRID), eGRID2023rev1. https://www.epa.gov/egrid. Accessed: 2025-05-19

[49]

Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, and others. 2024. MMLU-Pro: A more robust and challenging multi-task language understanding benchmark. Advances in Neural Information Processing Systems, 37:95266–95290

[50]

Yanran Wu, Inez Hua, and Yi Ding. 2025a. Not all water consumption is equal: A water stress weighted metric for sustainable computing. ACM SIGENERGY Energy Informatics Review, 5(2):84–90

[51]

Yanran Wu, Inez Hua, and Yi Ding. 2025b. Unveiling environmental impacts of large language model serving: A functional unit view. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10560–10576

[52]

Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, and Yuntian Deng. 2024. WildChat: 1M ChatGPT interaction logs in the wild. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=Bl8u7ZRlbM