GGRO monitors token entropy to trigger gradient-guided token injection from reward models, improving LLM alignment on safety, helpfulness, and reasoning tasks at inference time.
arXiv preprint arXiv:2409.05923
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Gradient-Guided Reward Optimization for Inference-time Alignment
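The GGRO summary mentions monitoring token entropy as the trigger for intervention. As a rough illustration of that one step only, the sketch below computes the Shannon entropy of a next-token distribution and compares it with a threshold. The function names, the natural-log entropy, and the threshold value of 1.0 are all assumptions made for this example; the summary does not say how GGRO computes or thresholds entropy, and the gradient-guided injection step is not shown.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_intervene(logits, threshold=1.0):
    """Hypothetical trigger: intervene when entropy exceeds the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return token_entropy(logits) > threshold


# Toy logits over a four-token vocabulary.
confident = [10.0, 0.0, 0.0, 0.0]  # one token dominates, low entropy
uncertain = [1.0, 1.0, 1.0, 1.0]   # uniform distribution, high entropy

print(should_intervene(confident))
print(should_intervene(uncertain))
```

In a real decoding loop the logits would come from the language model at each step, and a positive trigger would hand control to whatever reward-guided procedure is in use.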