AMARIS augments rubric updates in RL for LLMs with a persistent memory of rollout analyses and prior edits, yielding gains such as +2.8 points on GPQA-Diamond over local-adaptive baselines.
2 Pith papers cite this work. Polarity classification is still indexing.