Seirênes trains LLMs via adversarial self-play to generate and overcome evolving distractions, producing gains of 7-10 points on math reasoning benchmarks and exposing blind spots in larger models.
Adaptive distraction: Probing llm contextual robustness with automated tree search
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Guardian-as-an-Advisor prepends risk labels and explanations from a guardian model to queries, improving LLM safety compliance and reducing over-refusal while adding minimal compute overhead.
citing papers explorer
-
Seir\^enes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning
Seirênes trains LLMs via adversarial self-play to generate and overcome evolving distractions, producing gains of 7-10 points on math reasoning benchmarks and exposing blind spots in larger models.
-
Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs
Guardian-as-an-Advisor prepends risk labels and explanations from a guardian model to queries, improving LLM safety compliance and reducing over-refusal while adding minimal compute overhead.