BIRDS: Characterizing and Understanding Biodiversity Impact of Large Language Model Serving
Pith reviewed 2026-06-29 14:35 UTC · model grok-4.3
The pith
Biodiversity impact from large language model serving accumulates at scale and can be balanced against response quality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BIRDS defines request-level functional units, quantifies operational and embodied biodiversity impact of LLM serving, and introduces Quality-Normalized Biodiversity Impact to jointly analyze ecological impact and response quality. Across diverse workloads, models, GPUs, and regions, biodiversity impact accumulates at scale and exposes actionable quality-aware serving tradeoffs.
What carries the argument
The BIRDS framework, which uses request-level functional units to track operational and embodied biodiversity pathways, together with the Quality-Normalized Biodiversity Impact metric that normalizes those impacts by response quality.
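The page does not give the paper's formula for QNBI, so the following is only a minimal sketch of what "normalizing impact by response quality" could look like. It assumes a per-request impact split into operational and embodied parts, and an assumed ratio form `impact / quality**alpha`. The names `RequestImpact`, `qnbi`, and the exponent `alpha` are hypothetical illustrations, not the authors' definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestImpact:
    """Hypothetical per-request biodiversity impact, in arbitrary units."""
    operational: float  # impact attributed to serving the request (e.g. via energy use)
    embodied: float     # amortized share of hardware manufacturing impact

    @property
    def total(self) -> float:
        return self.operational + self.embodied


def qnbi(impact: RequestImpact, quality: float, alpha: float = 1.0) -> float:
    """Assumed form of a quality-normalized impact: total / quality**alpha.

    `alpha` stands in for the free normalization parameters mentioned in the
    review; the paper's actual parameterization may differ.
    """
    if quality <= 0:
        raise ValueError("quality must be positive")
    return impact.total / quality ** alpha


# Illustrative values only; they are not taken from the paper.
example = RequestImpact(operational=3.0, embodied=1.0)
score = qnbi(example, quality=0.8)
```

Under this assumed form, a lower score means less biodiversity impact per unit of response quality, which is the direction an operator would want to move in.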
If this is right
- Operators can select serving configurations that lower biodiversity cost for a given level of output quality.
- Different combinations of models and hardware produce measurably different biodiversity footprints that can be ranked.
- Per-request impacts, though small, sum to large ecosystem effects when request volume grows.
- Quality-aware decisions allow tradeoffs that reduce ecological load without sacrificing usable answers.
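The implications above can be illustrated with a short sketch, assuming that per-request impact scales linearly with request volume and that each serving configuration has a single impact value and a single quality score. The function names, configuration names, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Optional


def aggregate_impact(per_request: float, requests_per_day: float, days: int) -> float:
    """Total impact under the assumption that per-request impacts simply add up."""
    return per_request * requests_per_day * days


def pick_config(configs: list[dict], quality_floor: float) -> Optional[dict]:
    """Return the lowest-impact configuration whose quality meets the floor."""
    eligible = [c for c in configs if c["quality"] >= quality_floor]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c["impact"])


# Hypothetical configurations: impact in arbitrary units per request.
configs = [
    {"name": "small-model/gpu-a", "impact": 1.0, "quality": 0.70},
    {"name": "large-model/gpu-a", "impact": 4.0, "quality": 0.90},
    {"name": "large-model/gpu-b", "impact": 3.0, "quality": 0.88},
]

chosen = pick_config(configs, quality_floor=0.85)
monthly = aggregate_impact(chosen["impact"], requests_per_day=1_000_000, days=30)
```

A quality floor is one simple way to express "usable answers"; a ratio metric such as QNBI is another, and the two can rank configurations differently.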
Where Pith is reading between the lines
- Data-center siting decisions could incorporate biodiversity maps to avoid high-impact regions.
- The same request-level accounting might apply to other large-scale computing workloads beyond language models.
- Platforms could expose the QNBI value to users or regulators as an additional performance signal.
- Long-term monitoring of real ecosystems near serving facilities would test whether the modeled pathways match observed outcomes.
Load-bearing premise
Biodiversity impact can be accurately quantified at the request level using defined functional units for operational and embodied pathways without major unaccounted measurement errors or biases.
What would settle it
An independent measurement campaign at LLM data centers that records actual changes in local species populations or habitat quality and compares the totals to the summed per-request estimates produced by BIRDS.
Original abstract
Large language model (LLM) serving creates environmental impacts beyond carbon and water, including ecosystem damage through biodiversity-related pathways. We present BIRDS, a framework for Biodiversity Impact of Request-Driven LLM Serving. BIRDS defines request-level functional units, quantifies operational and embodied biodiversity impact, and introduces Quality-Normalized Biodiversity Impact (QNBI) to jointly analyze ecological impact and response quality. Across diverse workloads, models, GPUs, and regions, BIRDS reveals that biodiversity impact accumulates at scale and exposes actionable quality-aware serving tradeoffs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents the BIRDS framework for quantifying biodiversity impact of request-driven LLM serving. It defines request-level functional units, measures operational and embodied pathways, introduces the Quality-Normalized Biodiversity Impact (QNBI) metric to jointly consider impact and response quality, and applies the framework across workloads, models, GPUs, and regions to conclude that biodiversity impact accumulates at scale while exposing quality-aware serving tradeoffs.
Significance. If the request-level quantifications hold, the work is significant for extending LLM environmental assessments beyond carbon and water to biodiversity pathways. The QNBI metric is a novel contribution enabling joint analysis of ecological cost and quality, and the multi-setting evaluation provides concrete evidence of scale effects and tradeoffs that could guide sustainable serving policies.
major comments (1)
- [QNBI definition] The quality normalization parameters are free parameters. Without a sensitivity analysis showing that the reported quality-aware tradeoffs remain stable under reasonable perturbations of these parameters, the central claim that BIRDS exposes actionable tradeoffs is not yet load-bearing.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on the QNBI metric. We address the point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [QNBI definition] The quality normalization parameters are free parameters. Without a sensitivity analysis showing that the reported quality-aware tradeoffs remain stable under reasonable perturbations of these parameters, the central claim that BIRDS exposes actionable tradeoffs is not yet load-bearing.
  Authors: We agree that the quality normalization parameters in the QNBI definition are tunable and that explicit sensitivity analysis is required to substantiate the robustness of the quality-aware tradeoffs. In the revised manuscript we will add a dedicated sensitivity analysis subsection. This analysis will perturb the normalization parameters over ranges consistent with observed variability in the underlying quality metrics (e.g., factors from 0.5× to 2.0× the baseline values) and will demonstrate that the relative ordering of serving configurations and the identification of actionable tradeoffs remain stable. The new subsection will be placed after the main QNBI results and will directly support the central claims.
  Revision: yes
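The sensitivity check described in the rebuttal can be sketched as follows. This is not the authors' analysis: it assumes a hypothetical QNBI form `impact / quality**alpha`, treats `alpha` as the only normalization parameter, and asks whether the ranking of configurations survives scaling `alpha` by factors between 0.5× and 2.0×. All names and numbers are illustrative.

```python
from typing import NamedTuple, Sequence


class Config(NamedTuple):
    name: str
    impact: float   # hypothetical per-request biodiversity impact, arbitrary units
    quality: float  # hypothetical quality score in (0, 1]


def rank_by_qnbi(configs: Sequence[Config], alpha: float) -> list[str]:
    """Order configurations from lowest to highest assumed QNBI."""
    ordered = sorted(configs, key=lambda c: c.impact / c.quality ** alpha)
    return [c.name for c in ordered]


def ranking_is_stable(
    configs: Sequence[Config],
    baseline_alpha: float,
    factors: Sequence[float] = (0.5, 0.75, 1.0, 1.5, 2.0),
) -> bool:
    """True if every perturbed alpha yields the same ordering as the baseline."""
    baseline = rank_by_qnbi(configs, baseline_alpha)
    return all(
        rank_by_qnbi(configs, baseline_alpha * f) == baseline for f in factors
    )


# Widely separated configurations: the ordering does not depend on alpha here.
robust = [Config("a", 1.0, 0.70), Config("b", 3.0, 0.88), Config("c", 4.0, 0.90)]
# Close competitors: the ordering flips when alpha is halved.
fragile = [Config("x", 1.0, 0.50), Config("y", 1.5, 0.90)]

robust_ok = ranking_is_stable(robust, baseline_alpha=1.0)
fragile_ok = ranking_is_stable(fragile, baseline_alpha=1.0)
```

The `fragile` case shows why the referee's request matters: when two configurations are close, the choice of normalization parameter alone can decide which one looks better.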
Circularity Check
No significant circularity
Full rationale
The paper introduces BIRDS as a new framework that defines request-level functional units, quantifies operational and embodied biodiversity impacts via independent pathways, and defines QNBI for joint analysis. No equations, derivations, or self-citations are presented that reduce any claimed result to a fitted input or to a prior self-referential definition. The quantification is presented as an external measurement method, so the reported results do not follow from their own definitions by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Quality normalization parameters in QNBI
axioms (1)
- domain assumption: Biodiversity impact can be quantified using request-level functional units for operational and embodied effects
invented entities (1)
- Quality-Normalized Biodiversity Impact (QNBI): no independent evidence