
arxiv: 2605.30524 · v1 · pith:IAMCMCRSnew · submitted 2026-05-28 · 💻 cs.LG

Representation Collapse in Sequential Post-Training of Large Language Models

Pith reviewed 2026-06-29 08:08 UTC · model grok-4.3

classification 💻 cs.LG
keywords representation collapse · sequential post-training · large language models · plasticity · out-of-domain generalization · calibration · LoRA updates · measurement suite

The pith

Sequential post-training compresses LLM internal representations into low-rank spaces that limit later plasticity, generalization, and calibration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether chains of post-training stages on large language models gradually compress hidden representations into low-rank, anisotropic, and homogeneous spaces. It introduces a measurement suite covering hidden states, logits, token trajectories, and LoRA updates, applied across supervised fine-tuning, preference optimization, safety tuning, and specialization under controlled stage orders. If the central hypothesis holds, this concentration directly reduces how flexibly the model can adapt in subsequent stages, how well it handles out-of-domain inputs, and how accurately its output probabilities reflect true uncertainty. The authors also test simple countermeasures such as mixed-domain replay and diversity regularization to retain behavioral gains while keeping representations more open for future learning.

Core claim

Sequential post-training causes representation collapse, measured as progressive reduction in rank, increase in anisotropy, and rise in homogeneity across hidden states, logits, token paths, and parameter updates. This collapse is not merely geometric but predicts measurable drops in plasticity during later adaptation stages, weaker performance on out-of-distribution tasks, and degraded probability calibration. Controlled experiments varying stage order show that certain sequences accelerate the collapse while others slow it, and lightweight interventions including replay buffers and regularization terms can reduce collapse without erasing the gains from each post-training step.

What carries the argument

The measurement suite tracking rank, anisotropy, and homogeneity of hidden states, logits, token trajectories, and LoRA updates, which quantifies representation collapse and its link to reduced future learnability.
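
This page does not reproduce the paper's formulas for these quantities (the referee report below asks for them), so the following is only a minimal sketch under stated assumptions. Effective rank is taken here as the exponential of the entropy of the normalized singular values of the centered hidden-state matrix, and anisotropy as the mean cosine similarity between distinct token vectors. Both definitions, the function names, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def effective_rank(H):
    """Assumed definition: exp(entropy) of the normalized singular values
    of the mean-centered matrix H (rows = tokens, columns = features)."""
    Hc = H - H.mean(axis=0, keepdims=True)
    s = np.linalg.svd(Hc, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log() is defined
    return float(np.exp(-(p * np.log(p)).sum()))

def anisotropy(H):
    """Assumed definition: mean cosine similarity between distinct rows."""
    U = H / np.linalg.norm(H, axis=1, keepdims=True)
    G = U @ U.T
    n = H.shape[0]
    return float((G.sum() - np.trace(G)) / (n * (n - 1)))

rng = np.random.default_rng(0)

# Hypothetical "healthy" states: isotropic Gaussian features.
healthy = rng.normal(size=(256, 32))

# Hypothetical "collapsed" states: every token lies near one shared direction.
direction = rng.normal(size=(1, 32))
scale = 5.0 + rng.normal(size=(256, 1))
collapsed = scale * direction + 0.05 * rng.normal(size=(256, 32))

for name, H in [("healthy", healthy), ("collapsed", collapsed)]:
    print(name, "effective rank:", effective_rank(H), "anisotropy:", anisotropy(H))
```

On this toy data the collapsed matrix has a much lower effective rank and an anisotropy close to 1, which is the direction of change the core claim describes. Homogeneity is not covered by this sketch.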

If this is right

  • Models that reach higher representation concentration after early post-training stages exhibit measurably lower plasticity when a new task is introduced.
  • Out-of-domain generalization declines as hidden-state homogeneity increases across the sequence of training stages.
  • Probability calibration worsens in proportion to the degree of representation collapse induced by prior stages.
  • Certain orderings of fine-tuning, preference optimization, and specialization accelerate collapse more than others.
  • Mixed-domain replay and diversity regularization preserve measurable future learnability while retaining stage-specific behavioral improvements.
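
The calibration bullet above can be made concrete with a standard metric. This is a minimal sketch, assuming calibration is scored with expected calibration error (ECE) over equal-width confidence bins. This page does not say which calibration metric the paper uses, so the metric choice, the function name, and the toy inputs are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |average confidence - accuracy|."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece

# Toy checkpoint whose 75% confidence matches its 75% accuracy.
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])

# Toy checkpoint that is 95% confident but only 50% accurate.
overconfident = expected_calibration_error([0.95] * 4, [1, 1, 0, 0])

print("calibrated ECE:", calibrated)
print("overconfident ECE:", overconfident)
```

Tracking a number like this after each stage, next to the collapse metrics, is one way the proportionality in the bullet could be checked.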

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training pipelines could insert diversity checks after each major stage to decide whether to continue or reset.
  • The same concentration pattern may appear in non-LLM sequential learning settings such as chained reinforcement learning agents.
  • Routine monitoring of the measurement suite could guide when to apply corrective interventions during production post-training runs.

Load-bearing premise

The defined measurements of hidden states, logits, token trajectories, and LoRA updates isolate representation collapse and establish its causal connection to reduced plasticity and generalization under the tested stage sequences.

What would settle it

Run the same later adaptation stage on two models that differ only in measured representation concentration. If both show identical plasticity, out-of-domain accuracy, and calibration scores, the claimed predictive link is falsified.

Figures

Figures reproduced from arXiv: 2605.30524 by Chenxi Lin, Hao Wang, Jiarui Wu, Mingyu Chen, Rui Zhang, Wei Sun, Xiaoran Xu, Yichen Liu, Yutong Zhou, Yuxin Yang.

Figure 1: Sequential post-training is evaluated as a trajectory rather than as a single final checkpoint.
Figure 2: Layerwise collapse trajectories. Normalized effective rank and anisotropy are tracked ...
Figure 3: Objective signatures. The heatmap compares which data and objective types most strongly ...
Figure 4: Multi-model and multi-objective results. Collapse and future-learning trends are reported ...
Figure 5: Plasticity and mitigation analyses. Left: collapse measured before the future task predicts ...
Figure 6: Model-by-domain grid. Each panel fixes one metric and compares four model families ...
Figure 7: Token-span analysis. Splitting prompt, early-response, late-response, chain-of-thought, and ...
Figure 8: Stage-order analysis. The same objective set is compared under different orders, because ...
Figure 9: Calibration and retention analysis. These panels connect representation health to off-domain ...
Figure 10: LoRA subspace analysis. Layerwise overlap, principal angles, and top-rank energy reveal ...
Figure 11: Token-budget scaling. Collapse grows with more stage tokens, while target-task gains ...
Figure 12: Seed stability. Each line is one random seed for the same stage sequence. Seed-level ...
Figure 13: Probe-corpus sensitivity. Fixed, target-domain, general-domain, and generated-output ...
Figure 14: Mitigation sweeps. Collapse, target-domain performance, and future-task performance ...
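
The Figure 10 caption names principal angles and top-rank energy for LoRA updates. As a rough illustration of what those two quantities are, here is a minimal sketch in NumPy. It assumes a LoRA update has the usual form delta_W = B @ A, compares the column spaces of the B factors, and defines top-rank energy as the share of squared singular values in the top k. The variable names, shapes, and that energy definition are assumptions, not taken from the paper.

```python
import numpy as np

def principal_angles(U, V):
    """Principal angles (radians, ascending) between the column spaces of U and V."""
    Qu, _ = np.linalg.qr(U)  # orthonormal basis for span(U)
    Qv, _ = np.linalg.qr(V)  # orthonormal basis for span(V)
    cosines = np.linalg.svd(Qu.T @ Qv, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def top_rank_energy(delta_w, k):
    """Assumed definition: fraction of squared singular values in the top k."""
    s = np.linalg.svd(delta_w, compute_uv=False)
    return float((s[:k] ** 2).sum() / (s ** 2).sum())

rng = np.random.default_rng(0)
d, r = 64, 4  # hypothetical hidden size and LoRA rank

B_stage1 = rng.normal(size=(d, r))
# A second update that reuses the same subspace (columns are remixed only).
B_same = B_stage1 @ (np.eye(r) + 0.1 * rng.normal(size=(r, r)))
# A second update that lives in an unrelated random subspace.
B_other = rng.normal(size=(d, r))

angles_same = principal_angles(B_stage1, B_same)
angles_other = principal_angles(B_stage1, B_other)

A = rng.normal(size=(r, d))
delta_w = B_stage1 @ A  # rank-r update, so the top r directions hold all the energy

print("angles, shared subspace:", angles_same)
print("angles, unrelated subspace:", angles_other)
print("top-r energy:", top_rank_energy(delta_w, r))
```

Small principal angles between successive stages' updates would indicate that later stages keep writing into the same few directions.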
Original abstract

Large language models are now adapted through chains of post-training stages rather than through a single instruction-tuning pass. This paper studies whether such sequential post-training gradually compresses internal representations into low-rank, anisotropic, and homogeneous feature spaces. We define a measurement suite for hidden states, logits, token trajectories, and LoRA updates, and we use it to analyze supervised fine-tuning, preference optimization, safety/refusal tuning, math and code specialization, and long chain-of-thought tuning under controlled stage orderings. The central hypothesis is that excessive representation concentration is not merely a geometric curiosity: it predicts reduced plasticity during later adaptation, weaker out-of-domain generalization, and poorer calibration. We further evaluate lightweight interventions, including mixed-domain replay, feature refresh, representation diversity regularization, and LoRA update decorrelation, as ways to preserve future learnability without giving up the behavioral gains of post-training.
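
Of the interventions the abstract lists, mixed-domain replay is the simplest to picture. The following is a minimal sketch, assuming replay means reserving a fixed fraction of every batch for examples drawn from earlier or general domains. The function name, the replay fraction, and the toy data pools are hypothetical and are not taken from the paper.

```python
import random

def mixed_replay_batch(stage_pool, replay_pool, batch_size, replay_fraction, rng):
    """Build one batch that mixes current-stage data with replayed data."""
    n_replay = round(batch_size * replay_fraction)
    n_stage = batch_size - n_replay
    # Sampling with replacement keeps the sketch valid for small pools.
    batch = rng.choices(stage_pool, k=n_stage) + rng.choices(replay_pool, k=n_replay)
    rng.shuffle(batch)
    return batch

rng = random.Random(0)

# Hypothetical pools: the current specialization stage and a general-domain buffer.
math_stage = [("math", i) for i in range(100)]
general_replay = [("general", i) for i in range(100)]

batch = mixed_replay_batch(
    math_stage, general_replay, batch_size=8, replay_fraction=0.25, rng=rng
)
print(batch)
```

Here 2 of the 8 slots are replayed examples. The fraction is the knob that trades stage-specific gains against how much representational diversity is retained.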

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines whether sequential post-training stages in LLMs induce representation collapse, characterized by low-rank, anisotropic, and homogeneous feature spaces. It introduces a measurement suite spanning hidden-state anisotropy, logit concentration, token trajectories, and LoRA update norms, applies it across controlled orderings of SFT, preference optimization, safety tuning, math/code specialization, and long CoT, and hypothesizes that such collapse causally reduces later-stage plasticity, OOD generalization, and calibration. Lightweight interventions (mixed-domain replay, feature refresh, diversity regularization, LoRA decorrelation) are evaluated as mitigations that preserve behavioral gains.

Significance. If the measurement suite successfully isolates a causal geometric mechanism rather than stage-order confounders, the work would supply a concrete, testable account of why multi-stage post-training often degrades adaptability and would directly inform practical recipe design for preserving future learnability.

major comments (2)
  1. [Measurement suite section] The claim that the defined metrics (hidden-state anisotropy, logit concentration, token trajectories, LoRA norms) establish a causal link between representation collapse and reduced plasticity/OOD performance requires an explicit decoupling experiment. Because stage ordering simultaneously changes cumulative data exposure, optimization trajectory, and effective capacity, any observed correlation could be driven by those factors; the manuscript must show that the geometric signature predicts the downstream metrics even after controlling for the confounders.
  2. [Intervention evaluation section] The reported gains from replay, feature refresh, and diversity regularization must be accompanied by controls that verify the interventions act through the collapse metrics rather than through other mechanisms (e.g., simply increasing effective data diversity). Without such mediation analysis or matched ablations, it remains unclear whether the interventions succeed by mitigating the hypothesized geometric cause.
minor comments (2)
  1. Notation for the anisotropy and concentration metrics should be defined with explicit formulas and normalization details in the main text rather than deferred to an appendix.
  2. The abstract states the central hypothesis but supplies no quantitative outcomes; the introduction or results section should include a concise summary table of key effect sizes for the collapse–plasticity relationship.
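
Major comment 1 asks whether the collapse metrics still predict downstream behavior once confounders are controlled. One common check is a partial correlation: regress both variables on the confounders and correlate the residuals. The sketch below uses synthetic data in which a confounder (a stand-in for cumulative tokens) drives both variables, so the raw correlation is strong while the partial correlation is near zero. All variable names and the data-generating process are assumptions made for illustration, not results from the paper.

```python
import numpy as np

def partial_correlation(x, y, confounders):
    """Correlation of x and y after linearly regressing out the confounders."""
    Z = np.column_stack([np.ones(len(x)), confounders])  # add an intercept column
    resid_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    resid_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])

rng = np.random.default_rng(0)
n = 2000

# Synthetic confounder: standardized cumulative training tokens (hypothetical).
tokens = rng.normal(size=n)

# Both variables depend only on the confounder plus independent noise.
collapse_score = tokens + 0.5 * rng.normal(size=n)
plasticity = -tokens + 0.5 * rng.normal(size=n)

raw = float(np.corrcoef(collapse_score, plasticity)[0, 1])
partial = partial_correlation(collapse_score, plasticity, tokens)

print("raw correlation:", raw)
print("partial correlation given tokens:", partial)
```

In this constructed case the apparent link disappears after the control. The referee's request amounts to showing that the paper's data does not behave this way.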

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments correctly identify gaps in establishing causality for the collapse-plasticity link and in validating the mechanism of the proposed interventions. We address each point below and commit to revisions that strengthen these aspects without altering the core claims or experimental scope.

Point-by-point responses
  1. Referee: [Measurement suite section] The claim that the defined metrics (hidden-state anisotropy, logit concentration, token trajectories, LoRA norms) establish a causal link between representation collapse and reduced plasticity/OOD performance requires an explicit decoupling experiment. Because stage ordering simultaneously changes cumulative data exposure, optimization trajectory, and effective capacity, any observed correlation could be driven by those factors; the manuscript must show that the geometric signature predicts the downstream metrics even after controlling for the confounders.

    Authors: We acknowledge that controlled stage orderings alone do not fully isolate the geometric signature from confounders such as cumulative data exposure and optimization trajectory. In the revision we will add a decoupling experiment that holds total tokens and optimization steps fixed while varying only the presence of collapse-inducing stages (via replay buffers that restore diversity without changing data volume). We will then report partial correlations and regression coefficients showing that collapse metrics remain predictive of plasticity and OOD metrics after these controls.

    Revision: yes

  2. Referee: [Intervention evaluation section] The reported gains from replay, feature refresh, and diversity regularization must be accompanied by controls that verify the interventions act through the collapse metrics rather than through other mechanisms (e.g., simply increasing effective data diversity). Without such mediation analysis or matched ablations, it remains unclear whether the interventions succeed by mitigating the hypothesized geometric cause.

    Authors: The referee is right that the current intervention results lack explicit mediation or matched ablations. In the revision we will include (i) a mediation analysis regressing downstream gains on both intervention type and measured change in collapse metrics, and (ii) matched ablations that increase data diversity through non-geometric means (e.g., random token shuffling) and show they do not produce the same plasticity or calibration benefits. These additions will be reported alongside the existing tables.

    Revision: yes
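
The mediation analysis promised in response 2 can be pictured with a simple linear, product-of-coefficients decomposition. This is a minimal sketch, assuming ordinary least squares with a binary intervention indicator, the measured change in a collapse metric as the mediator, and a downstream gain as the outcome. The synthetic data is built so that the intervention acts only through the mediator; the names and effect sizes are assumptions, not results from the paper.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mediation(treatment, mediator, outcome):
    """Linear mediation: total effect = direct effect + indirect effect (a * b)."""
    total = ols(outcome, treatment)[1]
    a = ols(mediator, treatment)[1]  # treatment -> mediator
    _, direct, b = ols(outcome, treatment, mediator)  # b: mediator -> outcome
    return {"total": float(total), "direct": float(direct), "indirect": float(a * b)}

rng = np.random.default_rng(0)
n = 2000

# Hypothetical runs: 1.0 if replay was used, 0.0 otherwise.
replay = rng.integers(0, 2, size=n).astype(float)
# Replay lowers the collapse metric (synthetic effect of -1.0).
collapse_change = -1.0 * replay + 0.5 * rng.normal(size=n)
# Plasticity gain depends on the collapse change only, not on replay directly.
plasticity_gain = -2.0 * collapse_change + 0.5 * rng.normal(size=n)

effects = mediation(replay, collapse_change, plasticity_gain)
print(effects)
```

If the interventions worked mainly through some other mechanism, the direct term would stay large after the collapse metric is added to the regression, which is the distinction the referee asks the authors to draw.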

Circularity Check

0 steps flagged

No circularity: empirical measurements and hypotheses remain independent of inputs

Full rationale

The paper presents an empirical study defining a measurement suite for hidden states, logits, token trajectories, and LoRA updates, then tests the hypothesis that representation concentration correlates with reduced plasticity, OOD generalization, and calibration under controlled stage orderings. No equations, fitted parameters, or derivations are described that reduce predictions to inputs by construction. The central claim is framed as a testable empirical prediction rather than a self-referential definition or self-citation load-bearing theorem. Interventions are evaluated separately, with no renaming of known results or ansatz smuggling. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract contains no explicit free parameters, axioms, or invented entities; the work is framed as an empirical measurement study.

pith-pipeline@v0.9.1-grok · 5705 in / 1060 out tokens · 30001 ms · 2026-06-29T08:08:34.088418+00:00 · methodology


A Linearized theory of representation collapse

This appendix gives a compact derivation for the theoretic...