
arxiv: 2408.07587 · v4 · submitted 2024-08-14 · 💻 cs.LG · cs.DC

FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher

Pith reviewed 2026-05-23 21:46 UTC · model grok-4.3

classification 💻 cs.LG cs.DC
keywords federated learning · machine unlearning · knowledge distillation · on-device computation · data privacy · FedAvg protocol · forget data

The pith

FedQUIT lets clients unlearn their data on-device in federated learning by distilling from a modified global model without extra protocol assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FedQUIT, a method for on-device unlearning in federated learning that lets a client remove the influence of its own data from the shared global model. It does this through knowledge distillation where the client model learns from a virtual teacher created by altering the global model's output probabilities on the client's forget data. The alteration lowers confidence in the true labels while keeping the relative ordering among the other classes intact. This approach requires no changes to the standard FedAvg aggregation rule and produces unlearning results that match or beat six existing methods while cutting communication and compute costs versus full retraining from scratch.

Core claim

FedQUIT achieves unlearning in federated learning by having the requesting client train its local model via knowledge distillation from a virtual teacher. The teacher is obtained by manipulating the global model's outputs on forget data: true-class confidence is penalized while the relationships among non-true classes are preserved. This removes the influence of the client's data without additional assumptions beyond FedAvg.
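A minimal sketch of one way such a teacher could be built, assuming the manipulation happens at the logit level: the true-class logit is overwritten with a fixed value `v` and the result is passed through a softmax. The helper name `virtual_teacher_probs`, the toy logits, and the choice `v=0.0` are illustrative assumptions, not values taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def virtual_teacher_probs(global_logits, true_class, v=0.0):
    """Hypothetical helper: overwrite the true-class logit with a fixed
    value v (an assumed hyperparameter) and renormalize with softmax."""
    modified = list(global_logits)
    modified[true_class] = v
    return softmax(modified)

# Toy logits of the global model for one forget sample (made up).
global_logits = [2.0, 6.0, 1.0, 3.0]
true_class = 1

original = softmax(global_logits)
teacher = virtual_teacher_probs(global_logits, true_class, v=0.0)

# True-class confidence drops because v is below the original true-class logit.
print(original[true_class], teacher[true_class])
# Ratios between non-true classes are unchanged by the manipulation.
print(original[0] / original[3], teacher[0] / teacher[3])
```

In this sketch the non-true logits are untouched, so the ratio between any two non-true probabilities is the same before and after. The true-class confidence only falls when `v` is below the original true-class logit.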

What carries the argument

The quasi-competent virtual teacher created by selective output manipulation on forget data inside a teacher-student distillation loop where the client's local model is the student.
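The student side of the loop can be sketched as minimizing a divergence between teacher and student output distributions on forget samples. The snippet below assumes a KL-divergence objective, which is a common choice in knowledge distillation but is not confirmed by this page as the paper's exact loss. The function names and toy probabilities are hypothetical.

```python
import math

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student) for one sample; eps guards against log(0)."""
    return sum(
        t * (math.log(t + eps) - math.log(s + eps))
        for t, s in zip(teacher_probs, student_probs)
    )

def distillation_loss(teacher_batch, student_batch):
    """Mean per-sample KL divergence over a batch of forget samples."""
    per_sample = [kl_divergence(t, s) for t, s in zip(teacher_batch, student_batch)]
    return sum(per_sample) / len(per_sample)

# Toy teacher outputs on two forget samples (made up).
teacher_batch = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
# A student that already matches the teacher incurs zero loss.
matching_student = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
# A student still confident in class 0 is penalized.
confident_student = [[0.7, 0.2, 0.1], [0.8, 0.1, 0.1]]

loss_matching = distillation_loss(teacher_batch, matching_student)
loss_confident = distillation_loss(teacher_batch, confident_student)
print(loss_matching, loss_confident)
```

In an actual training loop the student probabilities would come from the client's local model, and the loss would be minimized with a gradient-based optimizer.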

Load-bearing premise

Penalizing true-class confidence on forget data while preserving non-true class relationships in the global model is enough to make the client model forget without harming its overall generalization under standard FedAvg.

What would settle it

A centralized evaluation would falsify the unlearning claim if, after FedQUIT completes, the updated global model's accuracy on the forget client's data stayed close to that of the original model rather than dropping toward that of a model retrained without the client.
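One way to make such a check concrete is to compare forget accuracy of the unlearned model with that of a model retrained without the client. The sketch below uses made-up predictions and labels, and the helper names `accuracy` and `forget_gap` are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the corresponding label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and aligned")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def forget_gap(unlearned_preds, retrained_preds, labels):
    """Forget accuracy of the unlearned model minus that of the retrained
    reference; values near zero suggest behaviour close to retraining."""
    return accuracy(unlearned_preds, labels) - accuracy(retrained_preds, labels)

# Toy forget-set labels and predictions (illustrative only).
labels = [0, 1, 2, 1, 0, 2, 1, 0]
unlearned_preds = [0, 1, 2, 0, 0, 1, 1, 2]
retrained_preds = [0, 1, 0, 0, 0, 1, 1, 2]

gap = forget_gap(unlearned_preds, retrained_preds, labels)
print(accuracy(unlearned_preds, labels), accuracy(retrained_preds, labels), gap)
```

A retrained model can still score well above chance on forget data through generalization, so the gap to the retrained reference is more informative than raw forget accuracy.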

Figures

Figures reproduced from arXiv: 2408.07587 by Alessio Mora, Andrea Passarella, Lorenzo Valerio, Paolo Bellavista.

Figure 1: FedQUIT overview. (1) Regular training via FedAvg
Figure 2: FedQUIT-Logits. datasets supposed to exist. Next, we explain our original approaches in full detail. FedQUIT-Logits. Indicating (for ease of notation) the global model at round t as w_t and its output probability as g_t(x), we design a modified output probability g′_t(x). Without loss of generality, we will omit the t index to simplify notation. FedQUIT-Logits sets to a fixed value of v the true-class log…
Figure 3: Test accuracy degradation after unlearning a client’s data
Figure 4: Visualization of FedQUIT and PGA performance across settings. A smaller polygon indicates better unlearning effectiveness.
Figure 5: Test Accuracy (Left) and Forget Accuracy (Right) for a representative client on CIFAR-100, ResNet-18, Non-IID, E = 1. FedQUIT minimizes test accuracy loss, demonstrating more selective removal of client contributions. Performance consistency during the recovery phase. Due to its lower initial degradation, FedQUIT maintains a more functional global model throughout the recovery phase
Figure 6: Label distribution across clients (0-9) for CIFAR-10
Figure 7: Test Accuracy (Left) and Forget Accuracy (Right) for a representative client on CIFAR-100, ResNet-18, Non-IID, E = 1. Axes: Recovery Rounds (x) versus Test Accuracy (%) and Forget Accuracy (%) (y). Curves: Original Model, Retrained Model, FedQUIT, PGA.
Figure 8: Test Accuracy (Left) and Forget Accuracy (Right) for a representative client. Setting: ResNet-18, CIFAR-100, IID, E = 1. MiT-B0 on CIFAR-100. FedQUIT (all variants), Incompetent Teacher: 1 unlearning epoch, learning rate 5e-4, AdamW optimizer, local batch size 32. PGA [12]: we report in …
Figure 9: Test Accuracy (Left) and Forget Accuracy (Right) for a representative client. Setting: MiT-B0, CIFAR-100, Non-IID, E = 1.

Algorithm   Rounds (↓)    CR (↑)   Test Acc.     Forget Acc.           MIA [29]
Original    -             -        75.03 ±0.00   84.25 ±5.63           77.03 ±8.57
Retrained   -             -        73.30 ±0.78   57.80 ±5.29           48.59 ±4.96
PGA [12]    6.9 ±4.28     7.2×     73.60 ±0.36   62.15 (4.35 ±3.28)    53.57 (4.98 ±3.67)
FedQUIT     8.78 ±2.54    5.7×     73.42 ±0.89   58.50 (1.86 ±1.46)    48.86 (2…
Figure 10: report the degradation after unlearning and before recovery for the centralized setting. Natural baseline (federated) results
Original abstract

Federated Learning (FL) enables the collaborative training of machine learning models without requiring centralized collection of user data. To comply with the right to be forgotten, FL clients should be able to request the removal of their data contributions from the global model. In this paper, we propose FedQUIT, a novel unlearning algorithm that operates directly on the device of the client that requests removal of its contribution. Our method leverages knowledge distillation to remove the influence of the target client's data from the global model while preserving its generalization ability. FedQUIT adopts a teacher-student framework, where a modified version of the current global model serves as a virtual teacher and the client's model acts as the student. The virtual teacher is obtained by manipulating the global model's outputs on forget data, penalizing the confidence assigned to the true class while preserving relationships among outputs of non-true classes, to simultaneously induce forgetting and retain useful knowledge. As a result, FedQUIT achieves unlearning without making any additional assumption over the standard FedAvg protocol. Evaluation across diverse datasets, data heterogeneity levels, and model architectures shows that FedQUIT achieves superior or comparable unlearning efficacy compared to six state-of-the-art methods, while significantly reducing cumulative communication and computational overhead relative to retraining from scratch.
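For context, the aggregation step that the abstract says stays unchanged can be sketched as follows. This is the usual FedAvg rule from McMahan et al. [24], a weighted average of client parameters by local sample count. Flattening each model into a list of floats and the function name `fedavg` are simplifying assumptions for illustration.

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors.

    client_weights: list of equal-length lists of floats (flattened models).
    client_sizes:   number of local training samples per client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two toy clients: the second holds three times as much data.
aggregated = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(aggregated)  # [2.5, 3.5]
```

Under the paper's claim, the update produced by the unlearning client would pass through this same rule, with no server-side modification.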

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes FedQUIT, an on-device federated unlearning algorithm for the right to be forgotten in FL. It uses a teacher-student knowledge distillation setup where the client's local model is the student and a modified version of the current global model serves as a virtual teacher; the teacher is created by manipulating outputs on forget data to penalize true-class confidence while preserving relative non-true class outputs. The method claims to achieve effective unlearning under the standard FedAvg protocol with no extra assumptions, and reports superior or comparable unlearning efficacy to six SOTA baselines across datasets, heterogeneity levels, and architectures, while cutting cumulative communication and compute relative to retraining from scratch.

Significance. If the core heuristic is shown to reliably excise client influence without degrading retain-set performance or requiring protocol changes, the result would be significant for practical deployment of unlearning in federated systems, as it avoids the high cost of full retraining and operates locally on the requesting client.

major comments (2)
  1. [Abstract / §3] Abstract (and §3, virtual-teacher construction): the central claim that FedQUIT requires 'no additional assumption over the standard FedAvg protocol' is load-bearing, yet the manuscript provides no derivation or bound showing why penalizing only the true-class logit on forget samples (while keeping non-true relationships) suffices to remove the client's contribution after aggregation; under non-IID FedAvg the global outputs on a single client's forget set may already be entangled with other clients' data, so the heuristic risks leaving residual influence or harming generalization on retain data.
  2. [Evaluation] Evaluation sections: the reported superiority over six baselines is presented without explicit controls for the exact definition of the manipulation parameter, statistical significance across runs, or ablation isolating the effect of the non-true-class preservation step; without these, it is unclear whether the claimed reduction in overhead is robust or an artifact of the chosen heterogeneity levels.
minor comments (2)
  1. [§3] Notation for the output manipulation (e.g., how the penalized logits are exactly computed and scaled) should be formalized with an equation rather than prose description.
  2. [Related Work] The paper should include a short related-work table contrasting FedQUIT's on-device, no-extra-assumption property against the six baselines.
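As an illustration of the kind of formalization the first minor comment asks for, the manipulation could be written as below. This is a sketch under assumptions: the symbols z (global-model logits), y (true class), and the fixed value v follow the fragment captured with Figure 2, and the softmax normalization is assumed rather than taken from the paper.

```latex
z'_c(x) =
\begin{cases}
  v        & \text{if } c = y, \\
  z_c(x)   & \text{if } c \neq y,
\end{cases}
\qquad
g'_c(x) = \frac{\exp\!\big(z'_c(x)\big)}{\sum_{k} \exp\!\big(z'_k(x)\big)} .

% For any two non-true classes c, c' \neq y the ratio is unchanged:
\frac{g'_c(x)}{g'_{c'}(x)}
  = \exp\!\big(z_c(x) - z_{c'}(x)\big)
  = \frac{g_c(x)}{g_{c'}(x)} .
```

Under these assumptions the relative ordering among non-true classes is preserved exactly, and the true-class probability decreases whenever v is smaller than the original true-class logit.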

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below.

  1. Referee: [Abstract / §3] Abstract (and §3, virtual-teacher construction): the central claim that FedQUIT requires 'no additional assumption over the standard FedAvg protocol' is load-bearing, yet the manuscript provides no derivation or bound showing why penalizing only the true-class logit on forget samples (while keeping non-true relationships) suffices to remove the client's contribution after aggregation; under non-IID FedAvg the global outputs on a single client's forget set may already be entangled with other clients' data, so the heuristic risks leaving residual influence or harming generalization on retain data.

    Authors: FedQUIT performs unlearning entirely locally on the requesting client using only the global model received under standard FedAvg, with no protocol changes or extra information required from the server. The virtual teacher is constructed by a local manipulation that reduces true-class confidence on forget samples while preserving relative outputs among non-true classes; this is presented as a practical heuristic rather than a theoretically bounded procedure. Experiments across non-IID partitions show effective unlearning without retain-set degradation, supporting that the approach does not introduce new assumptions. We will revise §3 and the discussion to clarify the heuristic motivation and explicitly note the absence of a formal bound on residual influence. revision: partial

  2. Referee: [Evaluation] Evaluation sections: the reported superiority over six baselines is presented without explicit controls for the exact definition of the manipulation parameter, statistical significance across runs, or ablation isolating the effect of the non-true-class preservation step; without these, it is unclear whether the claimed reduction in overhead is robust or an artifact of the chosen heterogeneity levels.

    Authors: We agree that the current evaluation would benefit from these controls. The revised manuscript will report the precise manipulation parameter values, include mean and standard deviation over multiple independent runs to establish statistical significance, and add an ablation isolating the non-true-class preservation component. These additions will demonstrate that the overhead reductions hold across the tested heterogeneity levels. revision: yes
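Reporting mean and standard deviation over independent runs, as promised in the response above, can be sketched with the Python standard library. The numbers are placeholders chosen for easy checking, not results from the paper, and `summarize_runs` is a hypothetical helper.

```python
import statistics

def summarize_runs(values):
    """Return (mean, sample standard deviation) over independent runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs for a standard deviation")
    return statistics.mean(values), statistics.stdev(values)

# Placeholder metric values from three hypothetical seeds.
runs = [1.0, 2.0, 3.0]
mean, std = summarize_runs(runs)
print(f"{mean:.2f} ±{std:.2f}")  # 2.00 ±1.00
```

Mean and standard deviation alone describe spread. A significance claim would additionally need a test across paired runs, which this sketch does not attempt.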

standing simulated objections not resolved
  • A formal derivation or bound establishing that the local heuristic suffices to excise client influence after aggregation under non-IID FedAvg.

Circularity Check

0 steps flagged

No load-bearing circularity; unlearning heuristic presented as independent of fitted inputs or self-citations

full rationale

The paper claims FedQUIT works under unmodified FedAvg with a virtual teacher obtained by output manipulation on forget data. No equations, self-citations, or ansatzes are shown reducing the unlearning efficacy or the 'no additional assumptions' claim to a quantity defined inside the paper. Evaluation uses external baselines and diverse datasets. This matches the default non-circular outcome for a method whose central step is a stated heuristic rather than a derived identity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the effectiveness of the described output manipulation under standard federated averaging assumptions; no explicit free parameters or invented entities are named in the abstract, though the manipulation itself likely involves at least one tunable penalty strength that is not detailed here.

axioms (1)
  • domain assumption: Standard assumptions of the FedAvg protocol are sufficient for the unlearning procedure to succeed without further modeling choices.
    The abstract explicitly states that FedQUIT makes no additional assumptions over FedAvg.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Asynchronous Federated Unlearning with Invariance Calibration for Medical Imaging

    cs.LG 2026-04 unverdicted novelty 5.0

    AFU-IC decouples client unlearning from global federated training in medical imaging and adds server-side invariance calibration to prevent relearning of erased data.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    Get rid of your trail: Remotely erasing backdoors in federated learning

    Manaar Alam, Hithem Lamri, and Michail Maniatakos. Get rid of your trail: Remotely erasing backdoors in federated learning. arXiv preprint arXiv:2304.10638, 2023. 2

  2. [2]

    Decentralised Learning in Federated Deployment Environments: A System-Level Survey

    Paolo Bellavista, Luca Foschini, and Alessio Mora. Decentralised Learning in Federated Deployment Environments: A System-Level Survey. ACM Computing Surveys (CSUR), 54(1):1–38, 2021. 1

  3. [3]

    Model compression

    Cristian Buciluǎ, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 535–541, 2006. 2

  4. [4]

    Fedrecover: Recovering from poisoning attacks in federated learning using historical information

    Xiaoyu Cao, Jinyuan Jia, Zaixi Zhang, and Neil Zhenqiang Gong. Fedrecover: Recovering from poisoning attacks in federated learning using historical information. In 2023 IEEE Symposium on Security and Privacy (SP), pages 1366–1383, 2023. 2

  5. [5]

    Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher

    Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and Mohan Kankanhalli. Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 7210–7217, 2023. 2, 8, 3, 4, 6, 7

  6. [6]

    Large scale distributed deep networks

    Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, et al. Large scale distributed deep networks. In Advances in neural information processing systems, pages 1223–1231, 2012. 2

  7. [7]

    Regulation (EU) 2016/679 of the European Parliament and of the Council, 2016

    European Parliament and Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council, 2016. 1

  8. [8]

    Eternal sunshine of the spotless net: Selective forgetting in deep networks

    Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9304–9312, 2020. 1

  9. [9]

    Ferrari: federated feature unlearning via optimizing feature sensitivity

    Hanlin Gu, WinKent Ong, Chee Seng Chan, and Lixin Fan. Ferrari: federated feature unlearning via optimizing feature sensitivity. Advances in Neural Information Processing Systems, 37:24150–24180, 2025. 3

  10. [10]

    Not all minorities are equal: Empty-class-aware distillation for heterogeneous federated learning

    Kuangpu Guo, Yuhe Ding, Jian Liang, Ran He, Zilei Wang, and Tieniu Tan. Not all minorities are equal: Empty-class-aware distillation for heterogeneous federated learning. arXiv preprint arXiv:2401.02329, 2024. 1, 2

  11. [11]

    FAST: Adopting Federated Unlearning to Eliminating Malicious Terminals at Server Side

    Xintong Guo, Pengfei Wang, Sen Qiu, Wei Song, Qiang Zhang, Xiaopeng Wei, and Dongsheng Zhou. FAST: Adopting Federated Unlearning to Eliminating Malicious Terminals at Server Side. IEEE Transactions on Network Science and Engineering, pages 1–14, 2023. 2

  12. [12]

    Federated unlearning: How to efficiently erase a client in fl?

    Anisa Halimi, Swanand Kadhe, Ambrish Rawat, and Nathalie Baracaldo. Federated unlearning: How to efficiently erase a client in fl? arXiv preprint arXiv:2207.05521,

  13. [13]

    Deep Residual Learning for Image Recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016. 5, 4

  14. [14]

    Distilling the Knowledge in a Neural Network

    Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015. 4, 2

  15. [15]

    Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification

    Tzu-Ming Harry Hsu, Hang Qi, and Matthew Brown. Measuring the effects of non-identical data distribution for federated visual classification. arXiv preprint arXiv:1909.06335,

  16. [16]

    Advances and open problems in federated learning

    Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, et al. Advances and open problems in federated learning. Foundations and trends® in machine learning, 14(1–2):1–210, 2021. 1, 2

  17. [17]

    Multi-level branched regularization for federated learning

    Jinkyu Kim, Geeho Kim, and Bohyung Han. Multi-level branched regularization for federated learning. In International Conference on Machine Learning, pages 11058–11073. PMLR, 2022. 4

  18. [18]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009. 5

  19. [19]

    Preservation of the global knowledge by not-true distillation in federated learning

    Gihun Lee, Minchan Jeong, Yongjin Shin, Sangmin Bae, and Se-Young Yun. Preservation of the global knowledge by not-true distillation in federated learning. In Advances in Neural Information Processing Systems, 2022. 1, 2

  20. [20]

    Federaser: Enabling efficient client-level data removal from federated learning models

    Gaoyang Liu, Xiaoqiang Ma, Yang Yang, Chen Wang, and Jiangchuan Liu. Federaser: Enabling efficient client-level data removal from federated learning models. In 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS), pages 1–10, 2021. 2, 3

  21. [21]

    Model sparsity can simplify machine unlearning

    Jiancheng Liu, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, Sijia Liu, et al. Model sparsity can simplify machine unlearning. Advances in Neural Information Processing Systems, 36, 2024. 6

  22. [22]

    The right to be forgotten in federated learning: An efficient realization with rapid retraining

    Yi Liu, Lei Xu, Xingliang Yuan, Cong Wang, and Bo Li. The right to be forgotten in federated learning: An efficient realization with rapid retraining. In IEEE INFOCOM 2022-IEEE Conference on Computer Communications, pages 1749–

  23. [23]

    Federated learning with label-masking distillation

    Jianghu Lu, Shikun Li, Kexin Bao, Pengju Wang, Zhenxing Qian, and Shiming Ge. Federated learning with label-masking distillation. In Proceedings of the 31st ACM International Conference on Multimedia, pages 222–232, 2023. 1, 2

  24. [24]

    Communication-efficient learning of deep networks from decentralized data

    Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pages 1273–1282. PMLR, 2017. 1, 2, 4

  25. [25]

    Knowledge distillation in federated learning: A practical guide

    Alessio Mora, Irene Tenison, Paolo Bellavista, and Irina Rish. Knowledge distillation in federated learning: A practical guide. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, pages 8188–8196. International Joint Conferences on Artificial Intelligence Organization, 2024. Survey Track. 2

  26. [26]

    Adaptive federated optimization

    Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, and Hugh Brendan McMahan. Adaptive federated optimization. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021. 3

  27. [27]

    Federated unlearning: A survey on methods, design guidelines, and evaluation metrics

    Nicolò Romandini, Alessio Mora, Carlo Mazzocca, Rebecca Montanari, and Paolo Bellavista. Federated unlearning: A survey on methods, design guidelines, and evaluation metrics. IEEE Transactions on Neural Networks and Learning Systems, pages 1–21, 2024. 2, 3, 6, 7

  28. [28]

    Membership inference attacks against machine learning models

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE, 2017. 1

  29. [29]

    Systematic evaluation of privacy risks of machine learning models

    Liwei Song and Prateek Mittal. Systematic evaluation of privacy risks of machine learning models. In 30th USENIX Security Symposium (USENIX Security 21), pages 2615–2632,

  30. [30]

    Privacy risks of securing machine learning models against adversarial examples

    Liwei Song, Reza Shokri, and Prateek Mittal. Privacy risks of securing machine learning models against adversarial examples. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, CCS 2019, London, UK, November 11-15, 2019, pages 241–257. ACM,

  31. [31]

    Federated Unlearning via Class-Discriminative Pruning

    Junxiao Wang, Song Guo, Xin Xie, and Heng Qi. Federated Unlearning via Class-Discriminative Pruning. In Proceedings of the ACM Web Conference 2022, pages 622–632, New York, NY, USA, 2022. Association for Computing Machinery. 3

  32. [32]

    Federated unlearning with knowledge distillation

    Chen Wu, Sencun Zhu, and Prasenjit Mitra. Federated unlearning with knowledge distillation. arXiv preprint arXiv:2201.09441, 2022. 2, 3

  33. [33]

    Segformer: Simple and efficient design for semantic segmentation with transformers

    Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems, 34:12077–12090, 2021. 5, 4

  34. [34]

    Machine unlearning: A survey

    Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, and Philip S Yu. Machine unlearning: A survey. ACM Computing Surveys, 56(1):1–36, 2023. 1

  35. [35]

    Local-global knowledge distillation in heterogeneous federated learning with non-iid data

    Dezhong Yao, Wanning Pan, Yutong Dai, Yao Wan, Xiaofeng Ding, Hai Jin, Zheng Xu, and Lichao Sun. Local-global knowledge distillation in heterogeneous federated learning with non-iid data. arXiv preprint arXiv:2107.00051,

  36. [36]

    Privacy risk in machine learning: Analyzing the connection to overfitting

    Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. Privacy risk in machine learning: Analyzing the connection to overfitting. In 2018 IEEE 31st computer security foundations symposium (CSF), pages 268–282. IEEE,

  37. [37]

    FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher Supplementary Material

  38. [38]

    Inspiring Observations We aim to design an FU method that operates on-device and fully adheres to the FL privacy requirements. This entails that the method would have direct access only to the unlearning client’s data, while the rest of the data in the federation (the retain data) would not be available for use in the unlearning algorithms, except via...

  39. [39]

    We use a crafted version of the FL global model as the teacher, serving as a natural proxy for the retain data that cannot be directly accessed in FL

    Similarly Inspired Work in FL The mechanisms that we present in this paper use a student-teacher framework locally at FL clients to retain the good knowledge from the original model while selectively scrubbing the contributions to forget. We use a crafted version of the FL global model as the teacher, serving as a natural proxy for the retain data that...

  40. [40]

    Extended Description of Table 1 Historical information. This refers to whether a method requires storing and accessing historical data, such as the complete history of per-client updates, which is typically maintained by the parameter server. Note that: (1) Linking per-client update histories to specific clients requesting unlearning undermines FL’s pri...

  41. [41]

    We developed the code with Python and with Python libraries; in our code repository, we provide the instructions to exactly reproduce our Python environment

    Infrastructure and Libraries We run all the experiments on a machine with Ubuntu 22.04, equipped with 64 GB of RAM and one NVIDIA RTX A5000 as GPU (32GB memory). We developed the code with Python and with Python libraries; in our code repository, we provide the instructions to exactly reproduce our Python environment

  42. [42]

    In general, MIA metrics reflect the information leakage of training algorithms about individual members of the training corpus

    Membership Inference Attacks In this section, we briefly describe Shokri’s attack and Yeom’s attack that we use in the experimental results. In general, MIA metrics reflect the information leakage of training algorithms about individual members of the training corpus. A lower MIA success rate implies less information about D_u in w_u. Song’s MIA [29]. T...

  43. [43]

    The unlearned model has the exact same model parameters of the …

    Baselines in Experiments Natural baseline (federated baseline): During the unlearning routine, there is no explicit unlearning. The unlearned model has the exact same model parameters of the … [Figure: label distribution across clients; axes Client and Label; panel (a) CIFAR-10 (Non-IID, α = 0.3).] ...

  44. [44]

    Hyper-parameter Tuning and Preprocessing In this Section, we report the hyper-parameter tuning of the various methods we used. 14.1. Regular Training (Federated Settings) CIFAR-10/CIFAR-100 Data Distribution. Figure 6 shows the label distribution across clients for the federated CIFAR-10 and CIFAR-100. ResNet-18 on CIFAR-10/CIFAR-100. We used a standa...

  45. [45]

    Further Experimental Results In this Section, we include further experimental results that are mentioned in the main paper but excluded for the sake of space. [Figure: Test Accuracy (%) versus Recovery Rounds; curves: Original Model, Retrained Model, FedQUIT, PGA.] ...