Veri-r1: Toward precise and faithful claim verification via online reinforcement learning. arXiv preprint arXiv:2510.01932

Qi He, Cheng Qian, Xiusi Chen, Bingxiang He, Yi R Fung, Heng Ji · arXiv 2510.01932

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Reinforcement Learning from Denoising Feedback

cs.CL · 2026-05-25 · unverdicted · novelty 5.0

RLDF is a new RL paradigm for diffusion language models that optimizes toward clipped clean states with weighted timestep sampling and reports substantial gains on reasoning benchmarks for LLaDA and Dream.

citing papers explorer

Reinforcement Learning from Denoising Feedback cs.CL · 2026-05-25 · unverdicted · none · ref 9
