Test-driven development and LLM-based code generation
Canonical reference. 100% of citing Pith papers cite this work as background.
Citation roles: background (5)
Citation polarities: background (5)
Representative citing papers
citing papers explorer
- From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements
  TDDev automates the full TDD loop for web app generation from requirements, delivering 34-48 percentage point quality gains and zero manual intervention in user studies.
- SkillOps: Managing LLM Agent Skill Libraries as Self-Maintaining Software Ecosystems
  SkillOps maintains LLM skill libraries via Skill Contracts and ecosystem graphs, raising ALFWorld task success to 79.5% as a standalone agent and improving retrieval baselines by up to 2.9 points with near-zero library-time LLM cost.
- A Learning Method for Symbolic Systems Using Large Language Models
  LLM2Ltac mines symbolic tactics from 11,725 Coq theorems using LLMs and integrates them into CoqHammer, improving proof rates by 23.87% on 6,199 theorems from four large verification projects.
- Combined Program Analysis Techniques: A Systematic Mapping Study
  A systematic mapping study of 248 papers introduces a taxonomy of synergistic effects, inter-analysis workflows, and mapping functions to catalog patterns in combined program analysis techniques.
- Guiding LLM-based Loop Invariant Synthesis via Feedback on Local Reasoning Errors
  LORIS detects local reasoning errors in LLM-generated proofs for loop invariants by translating natural-language steps to first-order logic implications and using invalid implications to refine the invariants, achieving 93.1% success on 460 C programs.
- Babbling Suppression: Making LLMs Greener One Token at a Time
  Babbling Suppression stops LLM code generation upon test passage to reduce token output and energy consumption by up to 65% across Python and Java benchmarks.
- Boosting Automatic Java-to-Cangjie Translation with Multi-Stage LLM Training and Error Repair
  Multi-stage LLM training plus compiler-guided error repair boosts functional equivalence in Java-to-Cangjie translation by 6.06% over prior methods despite scarce parallel data.
- Walking the Tightrope of LLMs for Software Development: A Practitioners' Perspective
  Qualitative interview study with 22 practitioners identifies multi-level benefits, challenges, and mitigation strategies for using LLMs in software development.
- Evaluating LLM-Generated ACSL Annotations for Formal Verification
  Rule-based annotation generation for ACSL outperforms LLM-based methods in achieving successful formal verification of C programs.
- Search-Based Software Engineering and AI Foundation Models: Current Landscape and Future Roadmap
  A research roadmap analyzing the current state of search-based software engineering with foundation models, outlining challenges and directions across three integration aspects.