pith. sign in

ACM transactions on intelligent systems and technology , volume=

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

years

2026 8

representative citing papers

ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents

cs.AI · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

ClawForge is a generator framework that creates reproducible executable benchmarks for command-line agents under state conflict, with ClawForge-Bench showing frontier models reach at most 45.3% strict accuracy and that state inspection drives most performance gaps.

citing papers explorer

Showing 8 of 8 citing papers.