The Twelfth International Conference on Learning Representations

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ContractBench: Can LLM Agents Preserve Observation Contracts?

cs.SE · 2026-05-17 · conditional · novelty 7.0

ContractBench shows that LLM agents frequently violate observation contracts by using expired artifacts or corrupting their byte integrity, with no model exceeding 80% success and notable scaling irregularities across families.

An Executable Benchmarking Suite for Tool-Using Agents

cs.SE · 2026-05-10 · unverdicted · novelty 5.0

The paper delivers a unified executable benchmarking suite for tool-using agents that enforces a shared evidence-admission contract across web, code, and micro-task environments.

citing papers explorer

Showing 2 of 2 citing papers.

ContractBench: Can LLM Agents Preserve Observation Contracts? cs.SE · 2026-05-17 · conditional · none · ref 17
An Executable Benchmarking Suite for Tool-Using Agents cs.SE · 2026-05-10 · unverdicted · none · ref 3
