Title resolution pending

Jaghouar, S · 2025 · arXiv 2505.07291

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Reward Hacking Benchmark: Measuring Exploits in LLM Agents with Tool Use

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

The Reward Hacking Benchmark shows RL post-training raises exploit rates in tool-using LLM agents from 0.6% to 13.9%, with environmental hardening cutting exploits by 87.7% relative without lowering task success.

PubSwap: Public-Data Off-Policy Coordination for Federated RLVR

cs.LG · 2026-04-14 · unverdicted · novelty 5.0

PubSwap uses a small public dataset for selective off-policy response swapping in federated RLVR to improve coordination and performance over standard baselines on math and medical reasoning tasks.

citing papers explorer

Showing 2 of 2 citing papers.

Reward Hacking Benchmark: Measuring Exploits in LLM Agents with Tool Use cs.LG · 2026-05-03 · unverdicted · none · ref 32
The Reward Hacking Benchmark shows RL post-training raises exploit rates in tool-using LLM agents from 0.6% to 13.9%, with environmental hardening cutting exploits by 87.7% relative without lowering task success.
PubSwap: Public-Data Off-Policy Coordination for Federated RLVR cs.LG · 2026-04-14 · unverdicted · none · ref 20
PubSwap uses a small public dataset for selective off-policy response swapping in federated RLVR to improve coordination and performance over standard baselines on math and medical reasoning tasks.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer