arXiv preprint arXiv:2305.17493
10 Pith papers cite this work.
citing papers explorer
- Evaluating Very Long-Term Conversational Memory of LLM Agents
  Introduces the LoCoMo benchmark for very long-term conversational memory in LLM agents and shows that current models struggle with lengthy dialogues and long-range temporal dynamics.
- Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences
  Recursive generative retraining with pluralistic preferences converges to a stable diverse distribution that satisfies a weighted Nash bargaining solution.
- Perturbation Dose Responses in Recursive LLM Loops: Raw Switching, Stochastic Floors, and Persistent Escape under Append, Replace, and Dialog Updates
  In 30-step recursive LLM loops, append-mode persistent escape from source basins reaches 50% near 400 tokens under full history but plateaus below 50% under a tail-clip memory policy, while replace-mode switching largely reflects state reset.
- RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience
  RLSpoofer trains a 4B model on 100 watermarked paraphrase pairs to spoof PF watermarks with a 62% success rate, far exceeding baselines trained on up to 10,000 samples.
- Annotations Mitigate Post-Training Mode Collapse
  Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning, while preserving instruction-following and improving with scale.
- Stabilizing Unsupervised Self-Evolution of MLLMs via Continuous Softened Retracing reSampling
  CSRS improves MLLM self-evolution stability by using retracing mechanisms and softened continuous rewards instead of majority voting, reaching SOTA on geometric reasoning benchmarks such as MathVision.
- Reinforced Self-Training (ReST) for Language Modeling
  ReST improves LLM translation quality on benchmarks via offline RL on self-generated data, achieving gains more compute-efficiently than typical RLHF.
- Textbooks Are All You Need
  A 1.3B-parameter code model trained on 7B tokens of curated textbook and synthetic data achieves 50.6% on HumanEval, indicating that data quality can enable strong performance at small scale.
- AgentSim: A Platform for Verifiable Agent-Trace Simulation
  AgentSim creates and releases the Agent-Trace Corpus of over 103,000 verifiable reasoning steps across three IR benchmarks, with claimed 100% grounding on substantive answers.
- Position: No Retroactive Cure for Infringement during Training
  Post-hoc mitigation cannot retroactively cure infringement that occurred during unauthorized data ingestion and training, because liability attaches to data lineage and retained expressive value in model weights.
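Several entries above study recursive training on model-generated data. The core dynamic can be seen in a minimal statistical toy, a sketch of our own and not any paper's method: repeatedly fit a Gaussian to a small sample drawn from the previous generation's fit, and the fitted standard deviation drifts toward zero, a loss of diversity analogous to model collapse.

```python
import random
import statistics

def retrain_step(mu, sigma, n, rng):
    """'Train' a new generation: fit a Gaussian to n samples
    drawn from the previous generation's fitted Gaussian."""
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.fmean(samples), statistics.stdev(samples)

def simulate_collapse(generations=500, n=10, mu=0.0, sigma=1.0, seed=0):
    """Iterate the fit-and-resample loop. With small n, estimation
    noise compounds across generations and sigma shrinks toward zero."""
    rng = random.Random(seed)
    history = [(mu, sigma)]
    for _ in range(generations):
        mu, sigma = retrain_step(mu, sigma, n, rng)
        history.append((mu, sigma))
    return history

if __name__ == "__main__":
    hist = simulate_collapse()
    print(f"gen 0 sigma: {hist[0][1]:.3f}, final sigma: {hist[-1][1]:.2e}")
```

The sample size `n=10` is deliberately small so the collapse is visible within a few hundred generations; with large samples the same multiplicative drift still operates, only more slowly.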
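The perturbation dose-response entry varies how a recursive loop carries state between steps. A schematic of the memory policies it names (append with full history, append with tail-clipping, and replace), where `step_fn` is a stand-in for the actual LLM call and everything else is our illustrative assumption:

```python
def run_recursive_loop(step_fn, seed_text, steps=30, mode="append", tail_clip=None):
    """Drive a recursive generation loop under different memory policies.

    mode="append":  every output is appended to the running context;
                    tail_clip=k keeps only the last k entries (tail-clip policy).
    mode="replace": the context is overwritten by the latest output, so each
                    step sees only its immediate predecessor (a state reset).
    step_fn is a placeholder for one model call: context string -> output string.
    """
    history = [seed_text]
    for _ in range(steps):
        if mode == "append":
            context = history if tail_clip is None else history[-tail_clip:]
            history.append(step_fn(" ".join(context)))
        elif mode == "replace":
            history = [step_fn(history[-1])]
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return history

def count_step(ctx):
    """Toy step function: reports how many context entries it saw."""
    return f"saw-{len(ctx.split())}"
```

Running the toy makes the policies' difference concrete: under full-history append the visible context grows every step, under `tail_clip` it saturates at the clip size, and under replace each step sees exactly one entry.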
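ReST's offline loop alternates a Grow step (sample from the current policy) with an Improve step (filter by reward, then fine-tune on the survivors). A schematic of one round, in which `sample_fn`, `reward_fn`, and `fine_tune_fn` are placeholders for the real model, reward model, and trainer rather than any actual API:

```python
def rest_round(sample_fn, reward_fn, fine_tune_fn, prompts, threshold):
    """One ReST-style round in outline.

    Grow:    generate candidate outputs from the current policy.
    Filter:  keep only samples whose reward clears the threshold.
    Improve: fine-tune offline on the filtered self-generated data.
    All three callables are stand-ins, not a real training stack.
    """
    grown = [(p, sample_fn(p)) for p in prompts]
    kept = [(p, y) for p, y in grown if reward_fn(p, y) >= threshold]
    new_policy = fine_tune_fn(kept)
    return new_policy, kept
```

Because the Improve step consumes a fixed, already-generated dataset, it can be repeated (typically with a rising threshold) without further sampling, which is where the compute efficiency relative to online RLHF comes from.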