TRACER : Trajectory risk aggregation for critical episodes in agentic reasoning

Sina Tayebati, Divake Kumar, Nastaran Darabi, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi · 2026 · arXiv 2602.11409

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

representative citing papers

VLM Judges Can Rank but Cannot Score: Task-Dependent Uncertainty in Multimodal Evaluation

cs.LG · 2026-04-28 · unverdicted · novelty 7.0

VLM judges exhibit task-dependent uncertainty in their scores, with conformal prediction revealing wide intervals for complex tasks and a decoupling between good ranking performance and poor absolute scoring reliability.

Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging

cs.SE · 2026-04-20 · unverdicted · novelty 7.0

Structural dependency graphs and staged pre-execution verification raise LLM-based EDA code pass rates to 82.5% (single-step) and 70-84% (multi-step) while halving tool calls by catching dependency violations before runtime.

citing papers explorer

Showing 2 of 2 citing papers.

VLM Judges Can Rank but Cannot Score: Task-Dependent Uncertainty in Multimodal Evaluation cs.LG · 2026-04-28 · unverdicted · none · ref 30
VLM judges exhibit task-dependent uncertainty in their scores, with conformal prediction revealing wide intervals for complex tasks and a decoupling between good ranking performance and poor absolute scoring reliability.
Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging cs.SE · 2026-04-20 · unverdicted · none · ref 23
Structural dependency graphs and staged pre-execution verification raise LLM-based EDA code pass rates to 82.5% (single-step) and 70-84% (multi-step) while halving tool calls by catching dependency violations before runtime.

TRACER : Trajectory risk aggregation for critical episodes in agentic reasoning

fields

years

verdicts

representative citing papers

citing papers explorer