LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Reconstructing 6946 syzbot bug-fix lifecycles reveals that accepted kernel patches are non-local and reviewer-constrained, enabling PatchAdvisor to improve automated repair quality over baselines via retrieval and diagnostic guidance.
By proving test suite coverage is monotone submodular and training LLMs with RL to maximize marginal gains, TestDecision improves branch coverage 38-52% and bug detection up to 95% over base models on ULT and LiveCodeBench.
citing papers explorer
-
Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.
-
Beyond Crash-to-Patch: Patch Evolution for Linux Kernel Repair
Reconstructing 6946 syzbot bug-fix lifecycles reveals that accepted kernel patches are non-local and reviewer-constrained, enabling PatchAdvisor to improve automated repair quality over baselines via retrieval and diagnostic guidance.
-
TestDecision: Sequential Test Suite Generation via Greedy Optimization and Reinforcement Learning
By proving test suite coverage is monotone submodular and training LLMs with RL to maximize marginal gains, TestDecision improves branch coverage 38-52% and bug detection up to 95% over base models on ULT and LiveCodeBench.