Code-QA-Bench uses an answer-first pipeline and three-condition experiments to generate 628 tasks across 10 Python repositories and quantify that code access drives most performance gains while documentation adds only modest benefit on doc-dependent tasks.
SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding. arXiv preprint arXiv:2603.16124, 2025
1 Pith paper cites this work.
Code-QA-Bench: Separating Code Reasoning from Documentation Memorization in Repository-Level QA