Leyline adds a policy-directed KV cache edit primitive with closed-form RoPE correction for agentic inference, reporting +11.2 pp cache-hit lift and +14.3 pp solve-rate gain.
debug-gym: A text-based environment for interactive debugging
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4representative citing papers
ADI equips AI debugging agents with function-level interaction via a new execution trace structure, raising SWE-bench Verified resolution to 63.8% at $1.28 per task and delivering 6-18% gains when added to existing agents.
Debug2Fix integrates interactive debugging via subagents into coding agents, delivering >20% gains on GitBug-Java and SWE-Bench-Live while enabling weaker models to match stronger ones.
A controlled user study with 24 programmers shows sketch-based pen input can handle breakpoint setting, step execution, and state inspection in debugging, though precision, recognition, and recall remain challenges.
citing papers explorer
-
Leyline: KV Cache Directives for Agentic Inference
Leyline adds a policy-directed KV cache edit primitive with closed-form RoPE correction for agentic inference, reporting +11.2 pp cache-hit lift and +14.3 pp solve-rate gain.
-
Empowering Autonomous Debugging Agents with Efficient Dynamic Analysis
ADI equips AI debugging agents with function-level interaction via a new execution trace structure, raising SWE-bench Verified resolution to 63.8% at $1.28 per task and delivering 6-18% gains when added to existing agents.
-
Debug2Fix: Can Interactive Debugging Help Coding Agents Fix More Bugs?
Debug2Fix integrates interactive debugging via subagents into coding agents, delivering >20% gains on GitBug-Java and SWE-Bench-Live while enabling weaker models to match stronger ones.
-
Sketch Bug: Using Sketch-Based Input for Interactive Code Debugging
A controlled user study with 24 programmers shows sketch-based pen input can handle breakpoint setting, step execution, and state inspection in debugging, though precision, recognition, and recall remain challenges.