Ohlsson, Björn Regnell, and Anders Wesslén
7 Pith papers cite this work.
roles: background 2
polarities: background 2
representative citing papers
citing papers explorer
- LLM-Assisted Empirical Software Engineering: Systematic Literature Review and Research Agenda
  A systematic review of 50 studies identifies 69 LLM-assisted tasks in empirical software engineering, concentrated in data processing and analysis with gaps in human-centered integration and reproducibility reporting.
- Guidelines for Empirical Studies in Software Engineering involving Large Language Models
  The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.
- Towards Improving the External Validity of Software Engineering Experiments with Transportability Methods
  Transportability methods can transport causal effects from experimental samples to broader target populations in software engineering by leveraging observational data to improve external validity.
- Constrained Co-evolutionary Metamorphic Differential Testing for Autonomous Systems with an Interpretability Approach
  CoCoMagic applies constrained cooperative co-evolution to metamorphic and differential testing to find up to 287% more distinct behavioral divergences in an end-to-end ADS than baseline search methods.
- Building to the Test: Coding Agents Deliver What You Check, Not What You Requested
  Coding agents build libraries that pass a hidden 222-test oracle by producing minimal demos holding only the tested behavior, while leaving the requested functionality unfinished or absent when the oracle is unavailable.
- A Comparison of Kubernetes Compliance Standards and Configuration Scanners
  A systematic comparison of eight Kubernetes hardening guidelines and ten scanners reveals substantial disparities in issue coverage and inconsistencies in scoring and ranking.
- Analyzing the Evolution of Structural Communities within Microservice Architecture
  Temporal community detection applied to six releases of the train-ticket microservice benchmark reveals a stable two-community structure aligned with business processes, plus some multi-community services.