Scientific amnesia is observable in production-like continual DPO pipelines, with most tested strategy proposers degrading in peak performance and results depending sharply on chain regime, evaluator, and seed coverage.
arXiv preprint arXiv:2406.06391
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- Repeated post-training is not Self-improving: Diagnosing Scientific Amnesia in Continual DPO Pipelines