IntentTester migrates tests across libraries using TDL abstraction and multi-agent LLM synthesis, achieving 85% correctness and 74% effectiveness versus 51% and 43% for baselines on nine projects in JSON, HTML, and Time domains.
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
verdicts: UNVERDICTED 4
roles: background 1
polarities: background 1
representative citing papers
Decoding Time Verification (DTV) interleaves verifier calls at structural boundaries during autoregressive code generation for C-to-Rust and JavaScript-to-TypeScript translation, raising pass rates while using fewer tokens than post-hoc baselines.
A new bias-aware benchmark for duplicate bug report detection shows simpler techniques outperform recent sophisticated methods on most projects and match industry tools.
A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.
citing papers explorer
- Large Language Model-Brained GUI Agents: A Survey