Tool-use agents suffer large accuracy drops from reward and transition perturbations but domain-randomized RL on static perturbations closes about 27% of the unseen transition gap while retaining most clean performance.
Cybersecurity For Beginners
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
MulDimIF introduces a multi-dimensional constraint framework and generation pipeline that reveals sharp performance drops in LLMs as instruction complexity rises and shows targeted training gains from attention module updates.
Identifies and classifies Mid-Session Tool Injection (MSTI) attacks on WebMCP into Tool Hijacking and Tool Framing, demonstrates disruption via implementation, and recommends mitigations like origin binding and logging.
Structured reflection makes error diagnosis and repair an explicit trainable step that improves reliability and reduces redundant calls in tool-using LLM agents.
citing papers explorer
-
Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions
Structured reflection makes error diagnosis and repair an explicit trainable step that improves reliability and reduces redundant calls in tool-using LLM agents.