ScarfBench supplies 34 Java applications yielding 204 directed cross-framework refactoring tasks, and shows that state-of-the-art agents achieve only a 15.3% test pass rate on focused migrations and 12.2% on whole applications.
E.4 Runtime Configuration: Table 12 separates the model declared in agent.toml from the model string explicitly passed by run.sh
fields: cs.SE (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
ScarfBench: A Benchmark for Cross-Framework Application Migration in Enterprise Java