Canonicalized Stable-List Replay for Private Federated Continual Learning over Language-Model Embeddings
Pith reviewed 2026-06-28 22:52 UTC · model grok-4.3
The pith
Public anchor sentences induce signatures that distinguish and align unordered private replay lists with high probability in federated continual learning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Clients privately produce candidate replay distributions over a shared sentence-embedding space and the server aligns them using signatures induced by public anchor sentences. The anchors provide identifiability for aggregation rather than additional replay data. We prove that, under an observable anchor-signature margin, O(log(N/η)/p) anchors distinguish N candidate list elements with probability at least 1-η, and we give a scoped anchorless non-identifiability result for unordered-label oracle models. Across five seeds the method improves final average task metric by 3.9-5.6 points over the strongest non-CSLR DP baseline at ε=4 under the reported replay-release budget.
What carries the argument
Signatures induced by public anchor sentences on candidate replay lists, which supply the margin needed for canonicalized aggregation of unordered elements.
If this is right
- A logarithmic number of public anchors is sufficient to achieve high-probability identifiability of list elements.
- CSLR produces 3.9-5.6 point gains in final average task metric over non-CSLR DP baselines at ε=4.
- The method outperforms both Hungarian and optimal-transport matchers on the reported benchmarks.
- The privacy guarantee applies directly to the replay-release step.
Where Pith is reading between the lines
- Anchor-based signatures could serve as a general mechanism for list alignment in other private distributed settings where direct ordering is unavailable.
- The anchorless non-identifiability result indicates that some form of external reference is necessary for stable aggregation of unordered private lists.
- Empirical checks of the margin size on new language-model embeddings would test whether the logarithmic bound remains practical at larger N.
Load-bearing premise
An observable margin exists in the anchor signatures that is large enough to distinguish the unordered list elements with the claimed probability.
What would settle it
A concrete embedding space or task set in which the anchor-signature margin is smaller than required, causing the observed distinction success rate to fall below 1-η for the predicted number of anchors.
Figures
read the original abstract
Federated continual learning (FCL) lets distributed clients adapt language-model heads to evolving NLP tasks without sharing raw text. Under user-level differential privacy (DP), replay-based continual learning faces a structural obstacle: clients can release only small noisy lists of candidate replay summaries, and those lists are unordered across clients. We introduce Canonicalized Stable-List Replay (CSLR), where clients privately produce candidate replay distributions over a shared sentence-embedding space and the server aligns them using signatures induced by public anchor sentences. The anchors provide identifiability for aggregation rather than additional replay data. We prove that, under an observable anchor-signature margin, $O(\log(N/\eta)/p)$ anchors distinguish $N$ candidate list elements with probability at least $1-\eta$, and we give a scoped anchorless non-identifiability result for unordered-label oracle models. Across five seeds on continual classification, NER, and dialogue benchmarks, CSLR improves the final average task metric by 3.9--5.6 points over the strongest non-CSLR DP baseline at $\eps=4$ under the reported replay-release budget, while also outperforming Hungarian and optimal-transport matchers. The formal privacy guarantee covers replay release; end-to-end private training additionally requires composition with a private optimizer for task-head updates.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Canonicalized Stable-List Replay (CSLR) for user-level DP federated continual learning on LM embeddings. Clients release small noisy unordered candidate replay lists over a shared embedding space; the server aligns them via signatures from public anchor sentences. Under an observable anchor-signature margin the authors prove that O(log(N/η)/p) anchors distinguish N list elements with probability ≥1-η, and they supply a complementary anchorless non-identifiability result. On continual classification, NER and dialogue benchmarks CSLR yields 3.9–5.6 point gains in final average task metric over the strongest non-CSLR DP baseline at ε=4 under the reported replay budget, while also beating Hungarian and OT matchers. The formal privacy guarantee is scoped to replay release.
Significance. If the anchor-margin assumption is satisfied and observable in the relevant embedding spaces, the method supplies a clean identifiability mechanism that avoids extra replay data and yields a concrete sample-complexity bound. The reported empirical improvements under realistic replay budgets would then constitute a practically relevant advance for private FCL.
major comments (2)
- [Abstract] Abstract (and the proof of the O(log(N/η)/p) bound): the identifiability result and the justification for server-side aggregation are conditioned on the existence of an observable positive anchor-signature margin, yet the manuscript reports no measurement of signature distances, margin size, or verification that the margin is large enough to achieve the claimed probability on the sentence-embedding spaces of the classification/NER/dialogue tasks. Without this verification the link between the theoretical bound and the 3.9–5.6 point gains is unsupported.
- [Experimental section] The experimental section (five seeds, reported replay-release budget): the paper states that end-to-end privacy requires separate composition with a private optimizer, but provides no access to exact data splits, replay budgets per client, or the full derivation of the anchor margin used in the bound; this weakens the claim that the observed gains are attributable to the canonicalization step rather than other implementation choices.
minor comments (2)
- Notation for the anchor margin p and the probability η should be introduced with a dedicated definition before the bound statement.
- [Abstract] The abstract claims “outperforming Hungarian and optimal-transport matchers” but does not specify whether these baselines also operate under the same DP replay-release budget; a clarifying sentence would help.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We address the major comments below, providing clarifications and committing to revisions where appropriate to strengthen the connection between theory and experiments.
read point-by-point responses
-
Referee: [Abstract] Abstract (and the proof of the O(log(N/η)/p) bound): the identifiability result and the justification for server-side aggregation are conditioned on the existence of an observable positive anchor-signature margin, yet the manuscript reports no measurement of signature distances, margin size, or verification that the margin is large enough to achieve the claimed probability on the sentence-embedding spaces of the classification/NER/dialogue tasks. Without this verification the link between the theoretical bound and the 3.9–5.6 point gains is unsupported.
Authors: We agree that explicit empirical verification of the anchor-signature margin would strengthen the manuscript. The theoretical result is conditioned on an observable positive margin, and while the empirical gains are consistent with the assumption holding in practice for the used embeddings, the manuscript does not report measurements of signature distances or margin sizes. In the revision, we will add a new subsection or appendix with these measurements on the relevant embedding spaces, including average inter-signature distances and estimated margins, to verify that the margin supports the claimed probability bounds. revision: yes
-
Referee: [Experimental section] The experimental section (five seeds, reported replay-release budget): the paper states that end-to-end privacy requires separate composition with a private optimizer, but provides no access to exact data splits, replay budgets per client, or the full derivation of the anchor margin used in the bound; this weakens the claim that the observed gains are attributable to the canonicalization step rather than other implementation choices.
Authors: We note that the paper uses publicly available benchmark datasets with standard train/test splits (details referenced in the experimental setup), and reports the overall replay-release budget. However, to address the concern about reproducibility and attribution of gains, we will expand the experimental section and add an appendix with per-client replay budgets, exact data split information, and a more detailed derivation of the specific anchor margin parameter p used in the bound application for the experiments. This will help isolate the contribution of the canonicalization step. revision: partial
Circularity Check
No significant circularity; bound is conditional and results are measured
full rationale
The paper states the O(log(N/η)/p) identifiability bound explicitly under the 'observable anchor-signature margin' assumption and supplies a complementary anchorless non-identifiability result. No equations or steps reduce the claimed result to its own inputs by construction, no parameters are fitted on a subset and then relabeled as predictions, and no self-citation chain is invoked as load-bearing justification. Empirical gains (3.9-5.6 points) are presented as measured experimental outcomes across seeds, not quantities defined by the proof parameters. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption There exists an observable anchor-signature margin that enables identifiability of candidate replay elements
Reference graph
Works this paper leans on
-
[1]
Proceedings of the 38th International Conference on Machine Learning , series =
Jaehong Yoon and Wonyong Jeong and Giwoong Lee and Eunho Yang and Sung Ju Hwang , title =. Proceedings of the 38th International Conference on Machine Learning , series =
-
[2]
International Conference on Learning Representations , year =
Daiqing Qi and Handong Zhao and Sheng Li , title =. International Conference on Learning Representations , year =
-
[3]
Advances in Neural Information Processing Systems , year =
Sara Babakniya and Zalan Fabian and Chaoyang He and Mahdi Soltanolkotabi and Salman Avestimehr , title =. Advances in Neural Information Processing Systems , year =
-
[4]
Advances in Neural Information Processing Systems , year =
David Rolnick and Arun Ahuja and Jonathan Schwarz and Timothy Lillicrap and Gregory Wayne , title =. Advances in Neural Information Processing Systems , year =
-
[6]
Proceedings of The 36th International Conference on Algorithmic Learning Theory , series =
Mohammad Afzali and Hassan Ashtiani and Christopher Liaw , title =. Proceedings of The 36th International Conference on Algorithmic Learning Theory , series =
-
[7]
Advances in Neural Information Processing Systems , volume =
Badih Ghazi and Ravi Kumar and Pasin Manurangsi , title =. Advances in Neural Information Processing Systems , volume =
-
[8]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =
Hao Yu and Xin Yang and Le Zhang and Hanlin Gu and Tianrui Li and Lixin Fan and Qiang Yang , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =
-
[12]
Proceedings of the 41st International Conference on Machine Learning , series =
Hongming Piao and Yichen Wu and Dapeng Wu and Ying Wei , title =. Proceedings of the 41st International Conference on Machine Learning , series =
-
[13]
2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings , series =
Junyan Ouyang and Rui Han and Chi Harold Liu , title =. 2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings , series =. 2023 , doi =
2023
-
[14]
Brendan McMahan and Ilya Mironov and Kunal Talwar and Li Zhang , title =
Martin Abadi and Andy Chu and Ian Goodfellow and H. Brendan McMahan and Ilya Mironov and Kunal Talwar and Li Zhang , title =. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security , year =
2016
-
[15]
Brendan McMahan and Daniel Ramage and Kunal Talwar and Li Zhang , title =
H. Brendan McMahan and Daniel Ramage and Kunal Talwar and Li Zhang , title =. International Conference on Learning Representations , year =
-
[16]
Proceedings of the 30th IEEE Computer Security Foundations Symposium , pages =
Ilya Mironov , title =. Proceedings of the 30th IEEE Computer Security Foundations Symposium , pages =
-
[17]
Large Language Models Can Be Strong Differentially Private Learners , journal =
Xuechen Li and Florian Tram\`. Large Language Models Can Be Strong Differentially Private Learners , journal =
-
[18]
Findings of the Association for Computational Linguistics: EMNLP , year =
Lingjuan Lyu and Xuanli He and Yitong Li and Kim-Kwang Raymond Choo and Haibo Hu , title =. Findings of the Association for Computational Linguistics: EMNLP , year =
-
[20]
Proceedings of the AAAI Conference on Artificial Intelligence , year =
Yue Tan and Guodong Long and Jie Ma and Lu Liu and Tianyi Zhou and Jing Jiang , title =. Proceedings of the AAAI Conference on Artificial Intelligence , year =
-
[21]
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing , year =
Nils Reimers and Iryna Gurevych , title =. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing , year =
2019
-
[22]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages =
Qinbin Li and Bingsheng He and Dawn Song , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages =
-
[23]
Advances in Neural Information Processing Systems , volume =
Mansheej Paul and Surya Ganguli and Gintare Karolina Dziugaite , title =. Advances in Neural Information Processing Systems , volume =
-
[25]
Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang
Martin Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security
2016
-
[26]
Mohammad Afzali, Hassan Ashtiani, and Christopher Liaw. 2025. Agnostic private density estimation for GMMs via list global stability. In Proceedings of The 36th International Conference on Algorithmic Learning Theory, volume 272 of Proceedings of Machine Learning Research, pages 41--66. PMLR
2025
-
[27]
Sara Babakniya, Zalan Fabian, Chaoyang He, Mahdi Soltanolkotabi, and Salman Avestimehr. 2023. A data-free approach to mitigate catastrophic forgetting in federated class incremental learning for vision tasks. In Advances in Neural Information Processing Systems
2023
- [28]
-
[29]
Badih Ghazi, Ravi Kumar, and Pasin Manurangsi. 2021. User-level private learning via correlated sampling. In Advances in Neural Information Processing Systems, volume 34, pages 20172--20184
2021
-
[30]
Qinbin Li, Bingsheng He, and Dawn Song. 2021. Model-contrastive federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10713--10722
2021
- [31]
- [32]
- [33]
-
[34]
Lingjuan Lyu, Xuanli He, Yitong Li, Kim-Kwang Raymond Choo, and Haibo Hu. 2020. Differentially private representation for NLP : Formal guarantee and an empirical study on privacy and fairness. In Findings of the Association for Computational Linguistics: EMNLP
2020
-
[35]
Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang
H. Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. 2018. Learning differentially private recurrent language models. In International Conference on Learning Representations
2018
-
[36]
Yongsheng Mei, Liangqi Yuan, Dong-Jun Han, Kevin S. Chan, Christopher G. Brinton, and Tian Lan. 2024. Using diffusion models as generative replay in continual federated learning -- what will happen? arXiv preprint arXiv:2411.06618
- [37]
-
[38]
Ilya Mironov. 2017. R\' e nyi differential privacy. In Proceedings of the 30th IEEE Computer Security Foundations Symposium, pages 263--275
2017
-
[39]
Junyan Ouyang, Rui Han, and Chi Harold Liu. 2023. https://doi.org/10.1109/VTC2023-Fall60731.2023.10333463 Evaluating differential privacy in federated continual learning . In 2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings, IEEE Vehicular Technology Conference. IEEE
-
[40]
Mansheej Paul, Surya Ganguli, and Gintare Karolina Dziugaite. 2021. Deep learning on a data diet: Finding important examples early in training. In Advances in Neural Information Processing Systems, volume 34, pages 20596--20607
2021
-
[41]
Hongming Piao, Yichen Wu, Dapeng Wu, and Ying Wei. 2024. Federated continual learning via prompt-based dual knowledge transfer. In Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pages 40725--40739. PMLR
2024
-
[42]
Daiqing Qi, Handong Zhao, and Sheng Li. 2023. Better generative replay for continual federated learning. In International Conference on Learning Representations
2023
-
[43]
Nils Reimers and Iryna Gurevych. 2019. Sentence- BERT : Sentence embeddings using siamese BERT -networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing
2019
-
[44]
David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne. 2019. Experience replay for continual learning. In Advances in Neural Information Processing Systems
2019
-
[45]
Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang. 2022. FedProto : Federated prototype learning across heterogeneous clients. In Proceedings of the AAAI Conference on Artificial Intelligence
2022
-
[46]
Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, and Sung Ju Hwang. 2021. Federated continual learning with weighted inter-client transfer. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 12073--12086. PMLR
2021
-
[47]
Hao Yu, Xin Yang, Le Zhang, Hanlin Gu, Tianrui Li, Lixin Fan, and Qiang Yang. 2025. Addressing spatial-temporal data heterogeneity in federated continual learning via tail anchor. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
2025
-
[48]
Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Huan Sun, David Levitan, Robert Sim, and Salil Vadhan. 2023. Synthetic text generation with differential privacy: A simple and practical recipe. arXiv preprint arXiv:2210.14348
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.