5 Pith papers cite this work.
fields: cs.SE (5)
years: 2026 (5)
representative citing papers
- MR-Coupler: Automated Metamorphic Test Generation via Functional Coupling Analysis
  MR-Coupler leverages functional coupling analysis and LLMs to generate valid metamorphic test cases for over 90% of tasks while detecting 44% of real bugs, outperforming baselines by 64.90% in validity and 36.56% in false-alarm reduction.
- Automated Functional Testing for Malleable Mobile Application Driven from User Intent
  ALADDIN is a user-requirement-driven GUI test generation framework that incrementally navigates mobile app UIs and builds LLM-guided oracles to validate both correct and faulty user-requested functionalities across six apps.
- Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization
  SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.
- Call-Chain-Aware LLM-Based Test Generation for Java Projects
  CAT improves line coverage by 18% and branch coverage by 22% over prior LLM test generation methods by adding call-chain and dependency context from static analysis to prompts.
- TestDecision: Sequential Test Suite Generation via Greedy Optimization and Reinforcement Learning
  By proving test suite coverage is monotone submodular and training LLMs with RL to maximize marginal gains, TestDecision improves branch coverage by 38-52% and bug detection by up to 95% over base models on ULT and LiveCodeBench.