TIGER: Inverting Transformer Gradients via Embedding-Subspace Distance Optimization

Dimitar I. Dimitrov; Ivo Petrov; Martin Vechev; William Kalikman

arxiv: 2606.18312 · v1 · pith:KEQIK4KEnew · submitted 2026-06-16 · 💻 cs.CR · cs.DC · cs.LG

Pith reviewed 2026-06-27 00:22 UTC · model grok-4.3

keywords: gradient inversion, federated learning, transformer, privacy attack, subspace optimization, differential privacy, input reconstruction, attention gradients

The pith

TIGER recovers input tokens from transformer gradients by optimizing their embeddings toward the low-rank subspace spanned by the attention gradients.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TIGER to invert gradients sent in federated learning of transformers. It extracts a low-rank subspace from attention gradients and then runs a continuous optimization that pulls candidate token embeddings toward that subspace. This replaces both expensive dummy-input matching and brittle discrete token tests. The method improves reconstruction on encoders and succeeds on decoders even when differential privacy adds noise to the gradients.

Core claim

TIGER identifies the subspace spanned by the low-rank attention gradients and directly minimizes the distance of token embeddings to this subspace, converting the reconstruction task into a differentiable objective that recovers original inputs more reliably than prior discrete or full-gradient-matching attacks.
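The core claim can be illustrated with a minimal NumPy sketch. It assumes a single linear layer whose weight gradient is available as a matrix `grad_w`, takes the top `rank` left singular vectors as a basis of the subspace, and scores a candidate embedding by its Euclidean residual. The function names, the rank-selection rule, and the choice of norm are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def subspace_projector(grad_w: np.ndarray, rank: int) -> np.ndarray:
    """Orthogonal projector onto the span of the top-`rank` left singular vectors."""
    u, _, _ = np.linalg.svd(grad_w, full_matrices=False)
    basis = u[:, :rank]               # (d_in, rank), orthonormal columns
    return basis @ basis.T            # (d_in, d_in)

def span_distance(v: np.ndarray, projector: np.ndarray) -> float:
    """Euclidean norm of the component of v orthogonal to the subspace."""
    return float(np.linalg.norm(v - projector @ v))

# Toy data (all hypothetical): 3 tokens feeding a 16 -> 12 linear layer.
rng = np.random.default_rng(0)
d_in, d_out, n_tokens = 16, 12, 3
x = rng.normal(size=(n_tokens, d_in))     # layer inputs (the "secret")
dy = rng.normal(size=(n_tokens, d_out))   # upstream gradients
grad_w = x.T @ dy                         # weight gradient of y = x @ W

p = subspace_projector(grad_w, rank=n_tokens)
inside = span_distance(x[0], p)                     # true input: numerically zero
outside = span_distance(rng.normal(size=d_in), p)   # unrelated vector: clearly nonzero
```

In this toy setting the true input has a residual at the level of floating-point error, while an unrelated vector does not, which is the signal the attack is described as exploiting.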

What carries the argument

Embedding-subspace distance optimization that aligns candidate token embeddings with the low-rank signal extracted from attention gradients.
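One plausible way to write this objective down, using the D(v, P_S) notation from the Figure 1 caption, is given here. The Euclidean norm and the orthonormal basis U of rank r are assumptions made for illustration; the summary does not state the exact form the paper uses.

```latex
D(v, P_S) = \lVert v - P_S v \rVert_2,
\qquad P_S = U U^\top,\quad U^\top U = I_r,
\qquad \nabla_v \tfrac{1}{2} D(v, P_S)^2 = (I - P_S)\, v .
```

Under these assumptions the gradient is simply the residual of v outside the subspace, which is what makes the objective cheap to differentiate.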

If this is right

On encoder-only models TIGER improves both reconstruction quality and runtime over existing attacks.
On decoder models TIGER succeeds in DP-defended federated learning where prior subspace attacks fail.
The continuous formulation avoids the instability of dummy-input optimization and the brittleness of discrete membership tests.
The approach scales to non-causal attention without requiring per-token discrete searches.
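The contrast between a thresholded membership test and a continuous distance signal can be sketched on a toy problem. Everything here is hypothetical: a random embedding table, a single linear-layer gradient, Gaussian noise of scale 1e-3, and a tight threshold of 1e-5. Ranking by distance stands in for the continuous objective; it is not the paper's optimization procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, d_model, d_out = 200, 32, 24
vocab = rng.normal(size=(vocab_size, d_model))   # hypothetical embedding table
true_ids = np.array([3, 17, 42])                 # hypothetical client tokens

x = vocab[true_ids]
dy = rng.normal(size=(len(true_ids), d_out))
clean_grad = x.T @ dy
noisy_grad = clean_grad + 1e-3 * rng.normal(size=clean_grad.shape)

def distances(grad: np.ndarray, rank: int) -> np.ndarray:
    """Distance of every vocabulary embedding to the top-`rank` gradient subspace."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    basis = u[:, :rank]
    residual = vocab - (vocab @ basis) @ basis.T
    return np.linalg.norm(residual, axis=1)

threshold = 1e-5
clean_d = distances(clean_grad, rank=3)
noisy_d = distances(noisy_grad, rank=3)

passed_clean = np.flatnonzero(clean_d < threshold)   # discrete test, clean gradient
passed_noisy = np.flatnonzero(noisy_d < threshold)   # discrete test, noisy gradient
ranked_noisy = np.sort(np.argsort(noisy_d)[:3])      # three smallest distances
```

On the clean gradient the threshold test selects exactly the true tokens. With noise, no token falls under the threshold, yet the true tokens still have the smallest distances, which is the intuition behind replacing a hard test with a quantity that can be minimized.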

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The subspace signal may remain informative in other architectures that exhibit low-rank gradient structure.
Defenses could be strengthened by specifically disrupting the low-rank component of attention gradients.
Cumulative leakage across multiple rounds of federated updates could be quantified by repeated application of the same subspace optimization.

Load-bearing premise

The low-rank attention gradients span a subspace that contains the true token embeddings even after numerical perturbations from quantization or differential privacy.
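How quickly this premise degrades can be probed with a small noise sweep. The sketch assumes a single linear layer, additive Gaussian noise on its weight gradient, and a known rank equal to the number of tokens; these are illustrative choices and say nothing about the noise levels used in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
n_tokens, d_in, d_out = 3, 32, 24
x = rng.normal(size=(n_tokens, d_in))          # hypothetical true layer inputs
dy = rng.normal(size=(n_tokens, d_out))        # hypothetical upstream gradients
clean_grad = x.T @ dy
noise = rng.normal(size=clean_grad.shape)      # fixed noise direction, rescaled below

def relative_residual(sigma: float) -> float:
    """Mean fraction of each true input's norm that lies outside the estimated subspace."""
    u, _, _ = np.linalg.svd(clean_grad + sigma * noise, full_matrices=False)
    basis = u[:, :n_tokens]
    residual = x - (x @ basis) @ basis.T
    return float(np.mean(np.linalg.norm(residual, axis=1) / np.linalg.norm(x, axis=1)))

sweep = {sigma: relative_residual(sigma) for sigma in (0.0, 1e-2, 100.0)}
```

A value near zero means the estimated subspace still contains the true inputs; a value near one means the subspace has become essentially unrelated to them.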

What would settle it

An experiment in which TIGER fails to recover any correct tokens from DP-protected gradients on a decoder-only transformer where the subspace no longer contains the true embeddings.
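The kind of perturbation such an experiment would apply can be sketched as a clip-then-noise step in the style of DP-SGD. This is a simplified stand-in: DP-SGD clips per example and adds noise to the aggregate, whereas this sketch treats one update as a single unit. The function name `dp_sanitize` and its parameters are hypothetical, and the summary does not specify the paper's exact DP mechanism.

```python
import numpy as np

def dp_sanitize(grad, clip_norm, noise_multiplier, rng):
    """Clip the update's L2 norm to clip_norm, then add isotropic Gaussian noise."""
    norm = np.linalg.norm(grad)
    scale = min(1.0, clip_norm / (norm + 1e-12))   # shrink only if the norm is too large
    clipped = grad * scale
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

rng = np.random.default_rng(3)
grad = rng.normal(size=(8, 4))                     # hypothetical client update
clipped_only = dp_sanitize(grad, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = dp_sanitize(grad, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Clipping alone rescales the update and leaves its column space intact; it is the added noise that can push the true embeddings out of the estimated subspace.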

Figures

Figures reproduced from arXiv: 2606.18312 by Dimitar I. Dimitrov, Ivo Petrov, Martin Vechev, William Kalikman.

**Figure 1.** Overview of TIGER. (a) The shared span objective: the span distance D(v, P_S) between a vector v and a subspace S with projection matrix P_S, computed by each SPAN block. (b) Decoder attack: embeddings are recovered position by position (i = 1, …, T_b) using the forward loss ℓ_forw only. Red marks attacker-simulated quantities. (c) Encoder attack: the simulated input Z̄_0 is optimized so that at every layer…

**Figure 2.** Closed-book MCQ accuracy of GEMMA-3-4B-IT fine-tuned on FictionalQA.

**Figure 3.** ROUGE-1 and mean per-attack wall-clock as a …

Original abstract

Federated learning allows multiple clients to jointly train a shared model by sending gradient updates to a central server while keeping raw inputs local. However, prior gradient inversion attacks show that these updates can reveal enough information to reconstruct client inputs. Existing attacks on transformers either optimize dummy inputs to match the true client updates, which is costly and unstable for modern models, or exploit the low rank of attention gradients to identify a subspace containing the true layer embeddings, followed by a discrete membership test for candidate tokens. However, this token test is brittle under numerical noise, i.e., from quantization or Differential Privacy (DP), and scales poorly for encoder models with non-causal attention. We introduce TIGER, a continuous gradient inversion attack that turns this subspace signal into a differentiable objective. Instead of searching over tokens or matching full gradients, TIGER directly optimizes token embeddings to minimize their distance to the subspace. Our experiments demonstrate that on encoder-only models, TIGER substantially improves both reconstruction quality and runtime over existing attacks, while on decoder models, TIGER is more robust than prior subspace-based attacks, enabling the first successful reconstructions in DP-defended federated learning settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TIGER replaces the brittle discrete token test with a differentiable embedding-to-subspace distance objective, which the abstract claims improves reconstruction on encoders and enables the first DP-robust results on decoders.

The main takeaway is that TIGER turns the low-rank subspace signal from attention gradients into a continuous optimization problem instead of relying on discrete membership tests. This is the concrete technical step forward: they optimize token embeddings to minimize distance to the extracted subspace rather than matching full gradients or testing candidates one by one.
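To make the "continuous optimization" point concrete, here is a deliberately stripped-down sketch: plain gradient descent on half the squared distance of one free vector to a fixed subspace. The subspace, step size, and iteration count are arbitrary assumptions. In the attack as described, the optimized quantities pass through the model's layers, so the minimizer is far more constrained than in this toy, where descent simply lands on the projection of the starting point.

```python
import numpy as np

rng = np.random.default_rng(2)
d_model, rank = 16, 3
basis, _ = np.linalg.qr(rng.normal(size=(d_model, rank)))  # orthonormal basis, assumed given
projector = basis @ basis.T

z = rng.normal(size=d_model)        # candidate embedding, random start
z0 = z.copy()
step_size = 0.5
history = []
for _ in range(60):
    residual = z - projector @ z    # gradient of 0.5 * ||z - P z||^2
    history.append(float(np.linalg.norm(residual)))
    z = z - step_size * residual    # residual shrinks by (1 - step_size) each step

final_distance = float(np.linalg.norm(z - projector @ z))
```

The distance decays geometrically, with no per-token search involved, which is the property the letter highlights.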

The paper does a reasonable job extending the subspace idea from prior work into something that scales better for encoder models and holds up under DP noise on decoders. The abstract positions the DP results as the first successful reconstructions in that setting, which would matter if the experiments hold. The approach avoids obvious fitting tricks and keeps the core objective parameter-free.

Soft spots are mostly about missing detail in the summary. No metrics, baselines, or ablation numbers appear here, so the size of the claimed gains in quality and runtime is hard to judge. The low-rank assumption is presented as reliable, but any real deployment would still need to check how sensitive the subspace extraction is to stronger DP noise or different model scales. Nothing in the description looks internally inconsistent.

This is for people working on gradient leakage and privacy defenses in federated transformer training. A reader already following subspace or inversion attacks would get the most out of the new objective and the DP experiments.

It deserves peer review. The shift to a differentiable objective is clear enough to be worth referee feedback even if the empirical claims need tightening.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces TIGER, a continuous gradient inversion attack for transformer models in federated learning settings. It exploits the low-rank structure of attention gradients to identify an embedding subspace and formulates a differentiable objective that optimizes token embeddings to minimize their distance to this subspace, avoiding both full gradient matching and brittle discrete membership tests. The central claim is that this yields substantially better reconstruction quality and runtime on encoder-only models and greater robustness on decoder models, including the first successful reconstructions under differential privacy noise.

Significance. If the experimental results hold, TIGER provides a more stable and scalable alternative to prior subspace-based and optimization-based inversion attacks on transformers. The shift to a continuous embedding-distance objective addresses a key brittleness under numerical perturbations such as quantization or DP, which is relevant for assessing privacy leakage in federated learning deployments of modern language models.

minor comments (2)

[Abstract] The claims that TIGER 'substantially improves both reconstruction quality and runtime' and enables the 'first successful reconstructions in DP-defended federated learning settings' are stated without quantitative metrics, baseline names, or dataset sizes; the abstract should cross-reference the results section, or a table of key numbers should be added for immediate assessment.
The description of the embedding-subspace distance objective would benefit from an explicit equation or pseudocode block showing how the low-rank attention gradient matrix is used to construct the subspace projector and how the distance is computed during optimization.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation for minor revision. The provided summary accurately captures the core contribution of TIGER as a continuous embedding-subspace optimization approach for gradient inversion.

Circularity Check

0 steps flagged

No significant circularity detected

The paper's central contribution is TIGER, which converts an existing low-rank subspace signal (from prior subspace-based attacks) into a new differentiable embedding-distance optimization objective. This is presented as an independent algorithmic extension rather than a re-derivation or fit of prior results. No equations, predictions, or uniqueness claims in the provided abstract reduce by construction to fitted inputs, self-citations, or ansatzes; the method is described as building on but extending the subspace idea with continuous optimization that improves robustness under noise. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated beyond the domain assumption of low-rank attention gradients.

axioms (1)

Domain assumption: Attention gradients in transformers exhibit low rank, allowing identification of an embedding subspace containing true tokens.
Invoked to justify the subspace signal used by TIGER and prior attacks.
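The standard linear-algebra fact behind this assumption can be stated for a single linear map, for example a query, key, or value projection. Here X stacks the b token inputs x_1, …, x_b as rows and d is the embedding width; applying it to a specific attention projection is an assumption about where the paper uses it.

```latex
Y = X W,\quad X \in \mathbb{R}^{b \times d}
\;\Longrightarrow\;
\nabla_W \mathcal{L} = X^\top \frac{\partial \mathcal{L}}{\partial Y},
\qquad
\operatorname{rank}(\nabla_W \mathcal{L}) \le b,
\qquad
\operatorname{col}(\nabla_W \mathcal{L}) \subseteq \operatorname{span}\{x_1, \dots, x_b\}.
```

When b < d and the upstream gradient has full row rank, the inclusion becomes an equality, so the column space of the weight gradient is exactly the span of the layer inputs.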

pith-pipeline@v0.9.1-grok · 5744 in / 1211 out tokens · 30917 ms · 2026-06-27T00:22:51.173770+00:00 · methodology

[1]

Communication-efficient learning of deep networks from decentralized data

B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20-22 April 2017, Fort Lauderdale, FL, USA, ser. Proceedings of Machine Learning Research, A. Singh and X. J...

2017

[2]

FATE-LLM: A industrial grade federated learning framework for large language models

T. Fan, Y. Kang, G. Ma, W. Chen, W. Wei, L. Fan, and Q. Yang, “FATE-LLM: A industrial grade federated learning framework for large language models,” CoRR, vol. abs/2310.10049, 2023. [Online]. Available: https://doi.org/10.48550/arXiv.2310.10049

doi:10.48550/arxiv.2310.10049 · 2023

[3]

Integration of large language models and federated learning

C. Chen, X. Feng, Y. Li, L. Lyu, J. Zhou, X. Zheng, and J. Yin, “Integration of large language models and federated learning,” Patterns, vol. 5, no. 12, p. 101098, 2024. [Online]. Available: https://doi.org/10.1016/j.patter.2024.101098

doi:10.1016/j.patter.2024.101098 · 2024

[4]

Federated large language models: Current progress and future directions

Y. Yao, J. Zhang, J. Wu, C. Huang, Y. Xia, T. Yu, R. Zhang, S. Kim, R. A. Rossi, A. Li, L. Yao, J. J. McAuley, Y. Chen, and C. Joe-Wong, “Federated large language models: Current progress and future directions,” CoRR, vol. abs/2409.15723, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2409.15723

doi:10.48550/arxiv.2409.15723 · 2024

[5]

Federated large language models: Feasibility, robustness, security and future directions

W. Jiang, Y. Luo, G. Deng, S. Chen, X. Yang, S. Wu, X. Gao, L. Liu, and S. Fu, “Federated large language models: Feasibility, robustness, security and future directions,” CoRR, vol. abs/2505.08830, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2505.08830

doi:10.48550/arxiv.2505.08830 · 2025

[6]

Federated multilingual models for medical transcript analysis

A. Manoel, M. del Carmen Hipolito Garcia, T. Baumel, S. Su, J. Chen, R. Sim, D. Miller, D. Karmon, and D. Dimitriadis, “Federated multilingual models for medical transcript analysis,” in Conference on Health, Inference, and Learning, CHIL 2023, Broad Institute of MIT and Harvard (Merkin Building), 415 Main Street, Cambridge, MA, USA, ser. Proceedings of Ma...

2023

[7]

Federated learning with layer skipping: Efficient training of large language models for healthcare NLP

L. Zhang and Y. Li, “Federated learning with layer skipping: Efficient training of large language models for healthcare NLP,” CoRR, vol. abs/2504.10536, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2504.10536

doi:10.48550/arxiv.2504.10536 · 2025

[8]

A federated and parameter-efficient framework for large language model training in medicine

A. Li, Y. Chen, W. Long, Y. Yin, Y. Hu, H. Kim, W. Zhou, Y. Zhou, H. Peng, Y. Ren, X. Ai, Z. Qin, M. Hu, X. Li, H. Yu, Y. Tham, L. Ohno-Machado, H. Xu, and Q. Chen, “A federated and parameter-efficient framework for large language model training in medicine,” CoRR, vol. abs/2601.22124, 2026. [Online]. Available: https://doi.org/10.48550/arXiv.2601.22124

doi:10.48550/arxiv.2601.22124 · 2026

[9]

Privacy-first health research with federated learning

A. Sadilek, L. Liu, D. Nguyen, M. Kamruzzaman, S. Serghiou, B. Rader, A. Ingerman, S. Mellem, P. Kairouz, E. O. Nsoesie, J. Macfarlane, A. Vullikanti, M. V. Marathe, P. Eastham, J. S. Brownstein, B. A. y Arcas, M. D. Howell, and J. Hernandez, “Privacy-first health research with federated learning,” npj Digit. Medicine, vol. 4, 2021. [Online]. Available: h...

2021

[10]

FEDLEGAL: the first real-world federated learning benchmark for legal NLP

Z. Zhang, X. Hu, J. Zhang, Y. Zhang, H. Wang, L. Qu, and Z. Xu, “FEDLEGAL: the first real-world federated learning benchmark for legal NLP,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, A. Rogers, J. L. Boyd-Graber, and N. Okazaki, Eds. Assoc...

doi:10.18653/v1/2023.acl-long.193 · 2023

[11]

Fedjudge: Federated legal large language model

L. Yue, Q. Liu, Y. Du, W. Gao, Y. Liu, and F. Yao, “Fedjudge: Federated legal large language model,” in Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Gifu, Japan, July 2-5, 2024, Proceedings, Part V, ser. Lecture Notes in Computer Science, M. Onizuka, J. Lee, Y. Tong, C. Xiao, Y. Ishikawa, S. Amer-Yahia, H. V ...

doi:10.1007/978-981-97-5569-1_17 · 2024

[12]

Flowertune: A cross-domain benchmark for federated fine-tuning of large language models

Y. Gao, M. R. Scamarcia, J. Fernández-Marqués, M. Naseri, C. S. Ng, D. Stripelis, Z. Li, T. Shen, J. Bai, D. Chen, Z. Zhang, R. Hu, I. Song, K. Lee, H. Jia, T. Dang, J. Wang, Z. Liu, D. J. Beutel, L. Lyu, and N. D. Lane, “Flowertune: A cross-domain benchmark for federated fine-tuning of large language models,” CoRR, vol. abs/2506.02961,

[13]

[Online]. Available: https://doi.org/10.48550/arXiv.2506.02961

doi:10.48550/arxiv.2506.02961

[14]

Differentially private secure multi-party computation for federated learning in financial applications

D. Byrd and A. Polychroniadou, “Differentially private secure multi-party computation for federated learning in financial applications,” in ICAIF ’20: The First ACM International Conference on AI in Finance, New York, NY, USA, October 15-16, 2020, T. Balch, Ed. ACM, 2020, pp. 16:1–16:9. [Online]. Available: https://doi.org/10.1145/3383455.3422562

doi:10.1145/3383455.3422562 · 2020

[15]

Deep leakage from gradients

L. Zhu, Z. Liu, and S. Han, “Deep leakage from gradients,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. B. Fox, and R. Garnett, Eds., 2019, pp. 14 747–14 756...

2019

[16]

idlg: Improved deep leakage from gradients

B. Zhao, K. R. Mopuri, and H. Bilen, “idlg: Improved deep leakage from gradients,” CoRR, vol. abs/2001.02610, 2020. [Online]. Available: http://arxiv.org/abs/2001.02610

2020

[17]

Inverting gradients - how easy is it to break privacy in federated learning?

J. Geiping, H. Bauermeister, H. Dröge, and M. Moeller, “Inverting gradients - how easy is it to break privacy in federated learning?” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. L...

2020

[18]

See through gradients: Image batch recovery via gradinversion

H. Yin, A. Mallya, A. Vahdat, J. M. Álvarez, J. Kautz, and P. Molchanov, “See through gradients: Image batch recovery via gradinversion,” in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021. Computer Vision Foundation / IEEE, 2021, pp. 16 337–16 346. [Online]. Available: https://openaccess.thecvf.com/content/...

2021

[19]

Gradient inversion with generative image prior

J. Jeon, J. Kim, K. Lee, S. Oh, and J. Ok, “Gradient inversion with generative image prior,” in Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan, Eds., 2021, pp. 29 898–29 ...

2021

[20]

Gradvit: Gradient inversion of vision transformers

A. Hatamizadeh, H. Yin, H. Roth, W. Li, J. Kautz, D. Xu, and P. Molchanov, “Gradvit: Gradient inversion of vision transformers,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022. IEEE, 2022, pp. 10 011–10 020. [Online]. Available: https://doi.org/10.1109/CVPR52688.2022.00978

doi:10.1109/cvpr52688.2022.00978 · 2022

[21]

Cocktail party attack: Breaking aggregation-based privacy in federated learning using independent component analysis

S. Kariyappa, C. Guo, K. Maeng, W. Xiong, G. E. Suh, M. K. Qureshi, and H. S. Lee, “Cocktail party attack: Breaking aggregation-based privacy in federated learning using independent component analysis,” in International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, ser. Proceedings of Machine Learning Research, A. Kraus...

2023

[22]

TAG: gradient attack on transformer-based language models

J. Deng, Y. Wang, J. Li, C. Wang, C. Shang, H. Liu, S. Rajasekaran, and C. Ding, “TAG: gradient attack on transformer-based language models,” in Findings of the Association for Computational Linguistics: EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 16-20 November, 2021, ser. Findings of ACL, M. Moens, X. Huang, L. Specia, and S. W. Yih, Eds...

doi:10.18653/v1/2021.findings-emnlp.305 · 2021

[23]

LAMP: extracting text from gradients with language model priors

M. Balunovic, D. I. Dimitrov, N. Jovanovic, and M. T. Vechev, “LAMP: extracting text from gradients with language model priors,” in Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, S. Koyejo, S. Mohamed, A. Agarwal, D. B...

2022

[24]

Recovering private text in federated learning of language models

S. Gupta, Y. Huang, Z. Zhong, T. Gao, K. Li, and D. Chen, “Recovering private text in federated learning of language models,” in Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, S. Koyejo, S. Mohamed, A. Agarwal, D. Bel...

2022

[25]

[Online]. Available: http://papers.nips.cc/paper_files/paper/2022/hash/35b5c175e139bff5f22a5361270fce87-Abstract-Conference.html

2022

[26]

Beyond gradient and priors in privacy attacks: Leveraging pooler layer inputs of language models in federated learning

J. Li, S. Liu, and Q. Lei, “Beyond gradient and priors in privacy attacks: Leveraging pooler layer inputs of language models in federated learning,” CoRR, vol. abs/2312.05720, 2023. [Online]. Available: https://doi.org/10.48550/arXiv.2312.05720

doi:10.48550/arxiv.2312.05720 · 2023

[27]

DAGER: exact gradient inversion for large language models

I. Petrov, D. I. Dimitrov, M. Baader, M. N. Müller, and M. T. Vechev, “DAGER: exact gradient inversion for large language models,” in Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024, A. Globersons, L. Mackey, D. Belgrave, A. Fa...

2024

[28]

Gemma 3

Gemma Team, “Gemma 3,” 2025. [Online]. Available: https://arxiv.org/abs/2503.19786

2025

[29]

Embeddinggemma: Powerful and lightweight text representations

H. Schechter Vera, S. Dua, and EmbeddingGemma Team, “Embeddinggemma: Powerful and lightweight text representations,”

[30]

[Online]. Available: https://arxiv.org/abs/2509.20354

[31]

Medgemma technical report

Google Research and Google DeepMind, “Medgemma technical report,” CoRR, vol. abs/2507.05201, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2507.05201

doi:10.48550/arxiv.2507.05201 · 2025

[32]

MedGemma 1.5 Technical Report

A. Sellergren, C. Gao, F. Mahvar, T. Kohlberger, F. Jamil, M. Traverse, A. Tono, B. Sadjad, L. Yang, C. Lau, L. Yatziv, T. L. Chen, B. Sterling, K. Philbrick, R. Tiwari, Y. Liu, M. Jajoo, C. Sankarapu, S. Vispute, H. Purandare, A. B. Mishra, S. Schmidgall, T. Tu, A. Palepu, C. Park, T. Strother, R. Thapa, Y. Cheng, P. Singh, K. Black, Y. Matias, K. Cho...

doi:10.48550/arxiv.2604.05081 · 2026

[33]

How small can you go? lora fine-tuning 270m-8b models for merchant information extraction in financial transactions

D. Huang, T. Drietomsky, B. Barrett, and Z. Wang, “How small can you go? lora fine-tuning 270m-8b models for merchant information extraction in financial transactions,” 2026. [Online]. Available: https://arxiv.org/abs/2606.08051

2026

[34]

An extended annotation scheme for personal-fact classification in dialogue

K. Zaitsev, “An extended annotation scheme for personal-fact classification in dialogue,” 2026, model: https://huggingface.co/adugeen/personal-facts-classifier-embeddinggemma-300m; Dataset: https://huggingface.co/datasets/adugeen/personal-facts-msc

2026

[35]

Pointer sentinel mixture models

S. Merity, C. Xiong, J. Bradbury, and R. Socher, “Pointer sentinel mixture models,” 2016

2016

[36]

ROUGE: A package for automatic evaluation of summaries

C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,” in Text Summarization Branches Out. Barcelona, Spain: Association for Computational Linguistics, Jul. 2004, pp. 74–81. [Online]. Available: https://aclanthology.org/W04-1013/

2004

[37]

R-GAP: recursive gradient attack on privacy

J. Zhu and M. B. Blaschko, “R-GAP: recursive gradient attack on privacy,” in 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021. [Online]. Available: https://openreview.net/forum?id=RSU17UoKfJF

2021

[38]

Towards general deep leakage in federated learning

J. Geng, Y. Mou, F. Li, Q. Li, O. Beyan, S. Decker, and C. Rong, “Towards general deep leakage in federated learning,” CoRR, vol. abs/2110.09074, 2021. [Online]. Available: https://arxiv.org/abs/2110.09074

2021

[39]

SPEAR: exact gradient inversion of batches in federated learning

D. I. Dimitrov, M. Baader, M. N. Müller, and M. T. Vechev, “SPEAR: exact gradient inversion of batches in federated learning,” in Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024, A. Globersons, L. Mackey, D. Belgrave, A. Fan, U...

2024

[40]

SPEAR++: scaling gradient inversion via sparsely-used dictionary learning

A. Bakarsky, D. I. Dimitrov, M. Baader, and M. T. Vechev, “SPEAR++: scaling gradient inversion via sparsely-used dictionary learning,” CoRR, vol. abs/2510.24200, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2510.24200

doi:10.48550/arxiv.2510.24200 · 2025

[41]

Robbing the fed: Directly obtaining private data in federated learning with modified models

L. H. Fowl, J. Geiping, W. Czaja, M. Goldblum, and T. Goldstein, “Robbing the fed: Directly obtaining private data in federated learning with modified models,” in The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022. [Online]. Available: https://openreview.net/forum?id=fwzUgo0FM9v

2022

[42]

Fishing for user data in large-batch federated learning via gradient magnification

Y. Wen, J. Geiping, L. Fowl, M. Goldblum, and T. Goldstein, “Fishing for user data in large-batch federated learning via gradient magnification,” in International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, ser. Proceedings of Machine Learning Research, K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvári, G. Niu, and...

2022

[43]

Decepticons: Corrupted transformers breach privacy in federated learning for language models

L. H. Fowl, J. Geiping, S. Reich, Y. Wen, W. Czaja, M. Goldblum, and T. Goldstein, “Decepticons: Corrupted transformers breach privacy in federated learning for language models,” in The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. [Online]. Available: https://openreview.net/...

2023

[44]

Panning for gold in federated learning: Targeted text extraction under arbitrarily large-scale aggregation

H. Chu, J. Geiping, L. H. Fowl, M. Goldblum, and T. Goldstein, “Panning for gold in federated learning: Targeted text extraction under arbitrarily large-scale aggregation,” in The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. [Online]. Available: https://openreview.net/forum?i...

2023

[45]

Minegrad: Gradient inversion attacks on loRA fine-tuning

H. U. Sami, S. Sen, and B. Guler, “Minegrad: Gradient inversion attacks on loRA fine-tuning,” in The 29th International Conference on Artificial Intelligence and Statistics, 2026. [Online]. Available: https://openreview.net/forum?id=dD9XOZUpNc

2026

[46]

APRIL: finding the achilles’ heel on privacy for vision transformers

J. Lu, X. S. Zhang, T. Zhao, X. He, and J. Cheng, “APRIL: finding the achilles’ heel on privacy for vision transformers,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022. IEEE, 2022, pp. 10 041–10 050. [Online]. Available: https://doi.org/10.1109/CVPR52688.2022.00981

doi:10.1109/cvpr52688.2022.00981 · 2022

[47]

GRAIN: exact graph reconstruction from gradients

M. Drencheva, I. Petrov, M. Baader, D. I. Dimitrov, and M. T. Vechev, “GRAIN: exact graph reconstruction from gradients,” in The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025. [Online]. Available: https://openreview.net/forum?id=7bAjVh3CG3

2025

[48]

Deep learning with differential privacy

M. Abadi, A. Chu, I. J. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016, E. R. Weippl, S. Katzenbeisser, C. Kruegel, A. C. Myers, and S. Halevi, Eds. ACM, 2016, pp. 308–318...

doi:10.1145/2976749.2978318 · 2016

[49]

A fictional q&a dataset for studying memorization and knowledge acquisition

J. Kirchenbauer, J. Mongkolsupawan, Y. Wen, T. Goldstein, and D. Ippolito, “A fictional q&a dataset for studying memorization and knowledge acquisition,” CoRR, vol. abs/2506.05639, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2506.05639

doi:10.48550/arxiv.2506.05639 · 2025

[50]

Learning word vectors for sentiment analysis

A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics, June 2011, pp. 142–150. [Online]. Available: http://...

2011

[52]

[Online]. Available: http://arxiv.org/abs/1712.07557

[53]

Differentially private learning with adaptive clipping

G. Andrew, O. Thakkar, B. McMahan, and S. Ramaswamy, “Differentially private learning with adaptive clipping,” in Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan, Eds., 20...

2021

[54]

Differentially private learning with per-sample adaptive clipping

T. Xia, S. Shen, S. Yao, X. Fu, K. Xu, X. Xu, and X. Fu, “Differentially private learning with per-sample adaptive clipping,” in Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligen...

doi:10.1609/aaai.v37i9.26242 · 2023

[55]

Federated learning: Strategies for improving communication efficiency

J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, “Federated learning: Strategies for improving communication efficiency,” CoRR, vol. abs/1610.05492, 2016. [Online]. Available: http://arxiv.org/abs/1610.05492

2016

[56]

Practical secure aggregation for privacy-preserving machine learning

K. A. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, B. Thuraisingham, D. Ev...

doi:10.1145/3133956.3133982 · 2017

[57]

Deep gradient compression: Reducing the communication bandwidth for distributed training

Y. Lin, S. Han, H. Mao, Y. Wang, and B. Dally, “Deep gradient compression: Reducing the communication bandwidth for distributed training,” in 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018. [Online]. Available: https://openreview.net/fo...

2018