Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, June 23, 75–86
3 Pith papers cite this work.
Citation role and polarity summary
Verdicts: UNVERDICTED 3
Roles: background 1
Polarities: background 1
Citing papers
- Single-Language Evidence Is Insufficient for Automated Logging: A Multilingual Benchmark and Empirical Study with LLMs
  MultiLogBench shows that LLM performance on automated logging varies substantially across programming languages, demonstrating that single-language evidence is insufficient for general claims about model behavior or tool design.
- AIReSim: A Discrete Event Simulator for Large-scale AI Cluster Reliability Modeling
  AIReSim is a discrete event simulator for evaluating failure mitigation, recovery, and capacity planning decisions in large AI clusters.
- The Prompt Engineering Report Distilled: Quick Start Guide for Life Sciences
  The paper reduces a broad set of prompt engineering techniques to six core approaches and applies them to life sciences use cases while addressing common LLM pitfalls.