Embedding Inference Attack
Pith reviewed 2026-07-03 20:38 UTC · model grok-4.3
The pith
Tailored queries can identify which embedding model powers a black-box IR system from the unordered sets of returned documents alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the black-box setting limited to unordered retrieved document sets, queries can be constructed whose returned document collections differ in ways that uniquely fingerprint the underlying embedding model among known candidates; the same queries remain effective when a reranker is present, and they also bypass LLM safeguards in a RAG system.
What carries the argument
Embedding inference attack via tailored queries whose returned document sets serve as distinctive fingerprints across candidate models.
If this is right
- Once the embedding model is identified, the adversary can apply model-specific follow-on attacks such as embedding inversion.
- Rerankers alone do not prevent model identification from document-set observations.
- Similarity-threshold defenses can be tested as a countermeasure to reduce query effectiveness.
- The attack extends to RAG systems where the same queries evade LLM input filters.
Where Pith is reading between the lines
- Production IR APIs may need query monitoring or output randomization to prevent systematic model fingerprinting.
- Similar fingerprinting could be attempted against other hidden components such as the reranker itself.
- The closed-candidate assumption could be relaxed by first clustering models from public repositories before launching the attack.
Load-bearing premise
The attacker knows a closed list of candidate embedding models and the crafted queries produce document sets that are sufficiently different for each model.
What would settle it
A set of queries that, when issued to each candidate model, produce identical or statistically indistinguishable unordered document sets would falsify the attack's ability to discriminate models.
Figures
read the original abstract
Embedding models are essential components of modern Information Retrieval (IR) systems, yet they are typically hidden behind APIs. Recent works have shown that dense IR system can lead to security vulnerabilities such as embedding inversion attacks. However, such attacks usually require that the attacker knows the embedding model for the attack to be applicable. In this paper, we study IR systems under a black-box setting in which the adversary observes only the unordered set of retrieved documents, without ranking or similarity scores. We demonstrate that in such contexts, tailored queries allow an adversary to identify which embedding model is in use from a set of known model candidate, which we coin as an embedding inference attack (EIA). We also show that certain queries remain discriminative even when the system includes a reranker as a potential defense mechanism. We further validate our method on a real Retrieval-Augmented Generation (RAG) system, in which the tailored queries bypass the LLM's tendency to reject inputs it does not recognize as well-formed questions. Finally, we propose and evaluate other mitigation strategies such as similarity thresholds.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to demonstrate an embedding inference attack (EIA) in which tailored queries enable an adversary to identify the embedding model in use within a black-box IR system, based solely on the unordered set of retrieved documents. The attack is asserted to remain effective even when a reranker is present as a defense, is validated on a real RAG system (where queries bypass LLM rejection of non-question inputs), and the authors propose and evaluate mitigations such as similarity thresholds.
Significance. If the empirical demonstration holds with high identification accuracy, the result is significant because it identifies a new, realistic side-channel for model inference in embedding-based retrieval and RAG pipelines that does not require access to scores or rankings. The black-box unordered-set threat model and the extension to real RAG systems are strengths; the explicit evaluation of defenses further increases practical relevance in the security literature.
major comments (2)
- [Abstract, paragraph 3] Abstract, paragraph 3: the central claim that tailored queries produce document sets distinctive enough to identify the embedding model from a closed candidate set is load-bearing, yet the manuscript provides no visible quantitative evidence (identification accuracy, set-overlap metrics, or success-rate tables) and no description of the query-generation method, so the claim cannot be checked against data.
- [Abstract] Abstract: the assertion that certain queries remain discriminative even with a reranker is load-bearing for the attack's robustness claim, but without reported accuracy numbers under the reranker condition or details on how document-set distinctiveness is preserved, it is impossible to assess whether the attack generalizes beyond the specific tested models and corpus.
minor comments (2)
- The acronym EIA is introduced without a literature check for prior usage; a brief note on novelty of the term would improve clarity.
- [Abstract] The abstract contains several long sentences that could be split to improve readability.
Simulated Author's Rebuttal
We thank the referee for the thorough review and for highlighting the need for greater transparency in the abstract. We address the two major comments below and will make the requested revisions to improve verifiability.
read point-by-point responses
-
Referee: [Abstract, paragraph 3] Abstract, paragraph 3: the central claim that tailored queries produce document sets distinctive enough to identify the embedding model from a closed candidate set is load-bearing, yet the manuscript provides no visible quantitative evidence (identification accuracy, set-overlap metrics, or success-rate tables) and no description of the query-generation method, so the claim cannot be checked against data.
Authors: We agree that the abstract should include quantitative support for the central claim. The body of the manuscript reports identification accuracies, set-overlap metrics, and success-rate tables, together with the query-generation procedure. We will revise the abstract to incorporate representative accuracy figures and a brief description of the query-generation method so that the claim can be assessed directly from the abstract. revision: yes
-
Referee: [Abstract] Abstract: the assertion that certain queries remain discriminative even with a reranker is load-bearing for the attack's robustness claim, but without reported accuracy numbers under the reranker condition or details on how document-set distinctiveness is preserved, it is impossible to assess whether the attack generalizes beyond the specific tested models and corpus.
Authors: We acknowledge that the abstract currently omits numerical results for the reranker setting. The manuscript contains these accuracy numbers and an analysis of how set distinctiveness is retained under reranking. We will update the abstract to report the relevant accuracy figures and a short note on preservation of distinctiveness. revision: yes
Circularity Check
Empirical demonstration with no derivation chain or load-bearing self-references
full rationale
The paper is an empirical attack demonstration: it shows via experiments that tailored queries can produce distinctive unordered document sets sufficient to identify an embedding model from a closed candidate set, even with a reranker present. No equations, parameter fitting, predictions derived from fits, or self-citation chains are described in the abstract or claimed structure. The central claim rests on observed experimental distinctiveness rather than any reduction to inputs by construction. This is the most common honest non-finding for attack papers that do not invoke uniqueness theorems or ansatzes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The target system uses exactly one embedding model from a known finite candidate set.
Reference graph
Works this paper leans on
-
[1]
Maya Anderson, Guy Amit, and Abigail Goldsteen. 2025. https://doi.org/10.5220/0013108300003899 Is my data in your retrieval database? membership inference attacks against retrieval augmented generation . International Conference on Information Systems Security and Privacy, 2:474--485. Publisher Copyright: 2025 by SCITEPRESS – Science and Technology Public...
-
[2]
Muhammad Arslan, Hussam Ghanem, Saba Munawar, and Christophe Cruz. 2024. https://doi.org/10.1016/j.procs.2024.09.178 A survey on rag with llms . Procedia Computer Science, 246:3781--3790. 28th International Conference on Knowledge Based and Intelligent information and Engineering Systems (KES 2024)
-
[3]
Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, and Tong Wang. 2016. Ms marco: A human generated machine reading comprehension dataset. In InCoCo@NIPS
2016
-
[4]
Matan Ben-Tov and Mahmood Sharif. 2024. https://api.semanticscholar.org/CorpusID:275133232 Gasliteing the retrieval: Exploring vulnerabilities in dense embedding-based search . Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security
2024
- [5]
-
[6]
Andrew Brown, Muhammad Roman, and Barry Devereux. 2025. https://doi.org/10.3390/bdcc9120320 A systematic literature review of retrieval-augmented generation: Techniques, metrics, and challenges . Big Data and Cognitive Computing, 9(12)
-
[7]
James Calam. 2023. https://www.pinecone.io/learn/series/rag/rerankers/ Rerankers and two-stage retrieval . Pinecone Learn
2023
-
[8]
Laura Caspari, Kanishka Ghosh Dastidar, Saber Zerhoudi, Jelena Mitrovic, and Michael Granitzer. 2024. https://ceur-ws.org/Vol-3784/short4.pdf Beyond benchmarks: Evaluating embedding model similarity for retrieval augmented generation systems . In Proceedings of the Workshop Information Retrieval's Role in RAG Systems (IR-RAG 2024) co-located with the 47th...
2024
-
[9]
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, and Xing Xie. 2024. https://doi.org/10.1145/3641289 A survey on evaluation of large language models . ACM Trans. Intell. Syst. Technol., 15(3)
-
[10]
Yiyi Chen, Qiongkai Xu, and Johannes Bjerva. 2025. https://doi.org/10.18653/v1/2025.acl-long.1185 ALGEN : Few-shot inversion attacks on textual embeddings via cross-model alignment and generation . In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 24330--24348, Vienna, Austria. Associ...
-
[11]
Cohere. 2026. https://docs.cohere.com/docs/cohere-embed Cohere's embed models (details and application) . Accessed: 2026-06-03
2026
-
[12]
Florin Cuconasu, Giovanni Trappolini, Federico Siciliano, Simone Filice, Cesare Campagnano, Yoelle Maarek, Nicola Tonellotto, and Fabrizio Silvestri. 2024. Rethinking relevance: How noise and distractors impact retrieval-augmented generation. In Proceedings of the 14th Italian Information Retrieval Workshop (IIR 2024). CEUR-WS
2024
-
[13]
Adam Dziedzic, Franziska Boenisch, Mingjian Jiang, Haonan Duan, and Nicolas Papernot. 2023. https://openreview.net/forum?id=XN5qOxI8gkz Sentence embedding encoders are easy to steal but hard to defend . In ICLR 2023 Workshop on Pitfalls of limited data and computation for Trustworthy ML
2023
-
[14]
Fangxiaoyu Feng, Yinfei Yang, Daniel Cer, Naveen Arivazhagan, and Wei Wang. 2022. https://doi.org/10.18653/v1/2022.acl-long.62 Language-agnostic BERT sentence embedding . In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 878--891, Dublin, Ireland. Association for Computational Linguistics
-
[15]
Minghao Hu, Yuxing Peng, Zhen Huang, and Dongsheng Li. 2019. https://doi.org/10.18653/v1/P19-1221 Retrieve, read, rerank: Towards end-to-end multi-document reading comprehension . In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2285--2295, Florence, Italy. Association for Computational Linguistics
-
[16]
Yu-Hsiang Huang, Yuche Tsai, Hsiang Hsiao, Hong-Yi Lin, and Shou-De Lin. 2024. https://doi.org/10.18653/v1/2024.acl-long.230 Transferable embedding inversion attack: Uncovering privacy risks in text embeddings without model queries . In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4...
-
[17]
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020 a . https://doi.org/10.18653/v1/2020.emnlp-main.550 Dense passage retrieval for open-domain question answering . In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769--6781, Online. ...
-
[18]
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick SH Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020 b . Dense passage retrieval for open-domain question answering. In EMNLP (1), pages 6769--6781
2020
- [19]
-
[20]
Antoine Lefebvre - Brossard, Stephane Gazaille, and Michel C. Desmarais. 2023. https://doi.org/10.48550/ARXIV.2302.07738 Alloprof: a new french question-answer education dataset and its use in an information retrieval case study . CoRR, abs/2302.07738
-
[21]
Haoran Li, Mingshi Xu, and Yangqiu Song. 2023. https://doi.org/10.18653/v1/2023.findings-acl.881 Sentence embedding leaks more information than you expect: Generative embedding inversion attack to recover the whole sentence . In Findings of the Association for Computational Linguistics: ACL 2023, pages 14022--14040, Toronto, Canada. Association for Comput...
-
[22]
Chin-Yew Lin. 2004. https://aclanthology.org/W04-1013/ ROUGE : A package for automatic evaluation of summaries . In Text Summarization Branches Out, pages 74--81, Barcelona, Spain. Association for Computational Linguistics
2004
- [23]
-
[24]
Manning, Prabhakar Raghavan, and Hinrich Sch \"u tze
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Sch \"u tze. 2008 a . https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-unranked-retrieval-sets-1.html Evaluation of unranked retrieval sets , chapter 8. Cambridge University Press. Section 8.2
2008
-
[25]
Manning, Prabhakar Raghavan, and Hinrich Sch\" u tze
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Sch\" u tze. 2008 b . https://doi.org/10.1017/cbo9780511809071 Scoring, term weighting, and the vector space model , chapter 6. Cambridge University Press
-
[26]
Mintplex Labs . 2024. Anythingllm: A full-stack application that turns any document, resource, or content into context that any llm can use as references during chatting. https://github.com/mintplex-labs/anything-llm. Accessed: 2026-04-29
2024
-
[27]
John Morris, Volodymyr Kuleshov, Vitaly Shmatikov, and Alexander Rush. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.765 Text embeddings reveal (almost) as much as text . In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12448--12460, Singapore. Association for Computational Linguistics
-
[28]
Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernandez Abrego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, and Yinfei Yang. 2022. https://doi.org/10.18653/v1/2022.emnlp-main.669 Large dual encoders are generalizable retrievers . In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9844--9855, A...
-
[29]
OpenAI. 2024. https://openai.com/index/new-embedding-models-and-api-updates/ New embedding models and api updates . Accessed: 2026-06-03
2024
- [30]
-
[31]
Kornaropoulos, and Giuseppe Ateniese
Dario Pasquini, Evgenios M. Kornaropoulos, and Giuseppe Ateniese. 2024. https://api.semanticscholar.org/CorpusID:271328475 Llmmap: Fingerprinting for large language models . ArXiv, abs/2407.15847
-
[32]
Nils Reimers and Iryna Gurevych. 2019. https://doi.org/10.18653/v1/D19-1410 Sentence- BERT : Sentence embeddings using S iamese BERT -networks . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982--3992, Hong Kong, Chi...
-
[33]
Sahel Sharifymoghaddam, Ronak Pradeep, Andre Slavescu, Ryan Nguyen, Andrew Xu, Zijian Chen, Yilin Zhang, Yidi Chen, Jasper Xian, and Jimmy Lin. 2025. https://doi.org/10.1145/3726302.3730331 Rankllm: A python package for reranking with llms . In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval,...
-
[34]
Manveer Singh Tamber, Jasper Xian, and Jimmy Lin. 2025. https://doi.org/10.18653/v1/2025.findings-naacl.104 Can ' t hide behind the API : Stealing black-box commercial embedding models . In Findings of the Association for Computational Linguistics: NAACL 2025, pages 1958--1969, Albuquerque, New Mexico. Association for Computational Linguistics
-
[35]
Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. 2024. https://arxiv.org/abs/2212.03533 Text embeddings by weakly-supervised contrastive pre-training . Preprint, arXiv:2212.03533
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[36]
Xingrui Xie, Han Liu, Wenzhe Hou, and Hongbin Huang. 2023. https://doi.org/10.1109/BigDIA60676.2023.10429609 A brief survey of vector databases . In 2023 9th International Conference on Big Data and Information Analytics (BigDIA), pages 364--371
-
[37]
Shenglai Zeng, Jiankun Zhang, Pengfei He, Yiding Liu, Yue Xing, Han Xu, Jie Ren, Yi Chang, Shuaiqiang Wang, Dawei Yin, and Jiliang Tang. 2024. https://doi.org/10.18653/v1/2024.findings-acl.267 The good and the bad: Exploring privacy issues in retrieval-augmented generation ( RAG ) . In Findings of the Association for Computational Linguistics: ACL 2024, p...
-
[38]
Collin Zhang, John X. Morris, and Vitaly Shmatikov. 2025 a . https://api.semanticscholar.org/CorpusID:277467864 Universal zero-shot embedding inversion . ArXiv, abs/2504.00147
-
[39]
Yanzhao Zhang, Mingxin Li, Dingkun Long, Xin Zhang, Huan Lin, Baosong Yang, Pengjun Xie, An Yang, Dayiheng Liu, Junyang Lin, Fei Huang, and Jingren Zhou. 2025 b . Qwen3 embedding: Advancing text embedding and reranking through foundation models. arXiv preprint arXiv:2506.05176
work page internal anchor Pith review Pith/arXiv arXiv 2025
- [40]
-
[41]
Yujia Zhou, Yan Liu, Xiaoxi Li, Jiajie Jin, Hongjin Qian, Zheng Liu, Chaozhuo Li, Zhicheng Dou, Tsung-Yi Ho, and Philip S. Yu. 2024. https://api.semanticscholar.org/CorpusID:272689561 Trustworthiness in retrieval-augmented generation systems: A survey . ArXiv, abs/2409.10102
work page internal anchor Pith review Pith/arXiv arXiv 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.