Advances in Neural Information Processing Systems
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AI (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
Non-reasoning LLMs fail the equivalence class problem while reasoning LLMs perform better but remain incomplete, with difficulty peaking at phase transition for the former and maximum diameter for the latter.
citing papers explorer
- Classifier Context Rot: Monitor Performance Degrades with Context Length
  Frontier LLMs miss dangerous actions in long coding agent transcripts 2-30 times more often after hundreds of thousands of benign tokens.
- OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces
  OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
- How Well Do LLMs Perform on the Simplest Long-Chain Reasoning Tasks: An Empirical Study on the Equivalence Class Problem
  Non-reasoning LLMs fail the equivalence class problem while reasoning LLMs perform better but remain incomplete, with difficulty peaking at phase transition for the former and maximum diameter for the latter.