Fine-tuned language models are continual learners.arXiv preprint arXiv:2205.12393

Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan · 2022 · arXiv 2205.12393

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Learning to Discover at Test Time

cs.LG · 2026-01-22 · unverdicted · novelty 7.0

TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.

Robust Policy Optimization to Prevent Catastrophic Forgetting

cs.LG · 2026-02-09 · unverdicted · novelty 6.0

FRPO applies a max-min robust optimization over KL-bounded policy neighborhoods during RLHF to reduce catastrophic forgetting of safety and accuracy under subsequent SFT or RL fine-tuning.

Galactica: A Large Language Model for Science

cs.CL · 2022-11-16 · unverdicted · novelty 5.0 · 2 refs

Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.

citing papers explorer

Showing 3 of 3 citing papers.

Learning to Discover at Test Time cs.LG · 2026-01-22 · unverdicted · none · ref 60
TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.
Robust Policy Optimization to Prevent Catastrophic Forgetting cs.LG · 2026-02-09 · unverdicted · none · ref 45
FRPO applies a max-min robust optimization over KL-bounded policy neighborhoods during RLHF to reduce catastrophic forgetting of safety and accuracy under subsequent SFT or RL fine-tuning.
Galactica: A Large Language Model for Science cs.CL · 2022-11-16 · unverdicted · none · ref 22 · 2 links
Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.

Fine-tuned language models are continual learners.arXiv preprint arXiv:2205.12393

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer