Recognition: 2 theorem links
· Lean Theorem · Membership Inference Attacks for Retrieval-Based In-Context Learning for Document Question Answering
Pith reviewed 2026-05-08 18:39 UTC · model grok-4.3
The pith
Retrieval-based in-context learning systems leak whether specific documents are in the retrieval database through simple prefix queries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Black-box membership-inference attacks on retrieval-augmented in-context learning for document question answering can be carried out by exploiting statistics on prefixes of the user query; a novel weighted-averaging scheme produces a membership score without requiring a reference model and maintains effectiveness against paraphrased member text.
What carries the argument
Prefix-based membership statistic that measures how retrieval similarity changes when successive prefixes of the query are supplied, either via reference-model loss or direct weighted averaging.
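A minimal sketch of that statistic, with all names and the per-prefix signal assumed rather than taken from the paper: issue successive prefixes of a candidate query, observe a black-box score for each, and aggregate.

```python
def prefixes(query: str, k: int):
    """Yield the first k whitespace-delimited prefixes of a query."""
    words = query.split()
    for i in range(1, min(k, len(words)) + 1):
        yield " ".join(words[:i])

def membership_score(query: str, similarity, k: int = 5) -> float:
    """Average a black-box per-prefix signal over k query prefixes.

    `similarity` is a stand-in for whatever signal the attacker can
    observe from the remote service; illustrative only.
    """
    scores = [similarity(p) for p in prefixes(query, k)]
    return sum(scores) / len(scores)
```

The paper's actual statistic may aggregate differently (its second attack uses a weighted rather than uniform average), but the shape of the computation is the same: one query per prefix, one scalar per query.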
If this is right
- Remote services that combine retrieval with in-context learning expose private membership information about their document collections.
- The attacks succeed with only a small number of prefixes and against paraphrased inputs.
- A simple ensemble-prompting defense substantially lowers leakage from the weighted-average attack.
- The new attacks outperform three prior membership-inference methods on this task in many evaluated cases.
Where Pith is reading between the lines
- The same prefix-scoring idea could be tested on other retrieval-augmented generation tasks beyond question answering.
- Randomizing or adding noise to the retrieval ranking might be a practical countermeasure worth measuring.
- The reference-model-free weighted-average statistic may apply to other black-box settings where loss values are unavailable.
Load-bearing premise
The retrieval function picks examples by similarity to the query such that prefix statistics can still separate member documents from non-members even after the text has been paraphrased.
What would settle it
Running the attacks on a retrieval system whose similarity function has been replaced by uniform random selection and finding that accuracy falls to chance level.
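That ablation can be sketched end to end; everything below is a toy simulation (document ids, score function, and threshold are assumptions), meant only to show that a query-blind retriever leaves nothing for a prefix statistic to separate.

```python
import random

rng = random.Random(0)
corpus = list(range(200))  # toy document ids; ids 0-99 play the "member" role

def random_retriever(query, k=4):
    """Uniform random selection: ignores the query entirely,
    which is exactly the ablation proposed above."""
    return set(rng.sample(corpus, k))

def prefix_attack_score(doc_id, n_prefixes=8):
    """Toy stand-in for the prefix attack: fraction of prefix queries
    whose retrieved set contains the target document."""
    hits = sum(doc_id in random_retriever((doc_id, i))
               for i in range(n_prefixes))
    return hits / n_prefixes

member_scores = [prefix_attack_score(d) for d in range(100)]
nonmember_scores = [prefix_attack_score(d) for d in range(100, 200)]

# Thresholding cannot beat chance when retrieval ignores the query:
correct = (sum(s > 0 for s in member_scores)
           + sum(s <= 0 for s in nonmember_scores))
accuracy = correct / 200
```

Under random retrieval the member and non-member score distributions coincide, so accuracy hovers near 0.5; a real similarity-based retriever is what would pull them apart.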
read the original abstract
We show that remotely hosted applications employing in-context learning when augmented with a retrieval function to select in-context examples can be vulnerable to membership-inference attacks even when the service provider and users are separate parties. We propose two black-box membership inference attacks that exploit query text prefixes to distinguish member from non-member inputs. The first attack uses a reference model to estimate an otherwise unavailable loss metric. The second attack improves upon it by eliminating the reference model and instead computing a membership statistic through a simple but novel weighted-averaging scheme. Our comprehensive empirical evaluations consider a stricter case in which the adversary has a paraphrased version of the text in the queries and show that our attacks can exhibit stronger resilience to paraphrasing and outperform three prior attacks in many cases with a small number of prefixes. We also adapt an existing ensemble prompting defense to our setting, demonstrating that it substantially mitigates the privacy leakage caused by our second attack.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes two black-box membership inference attacks on retrieval-augmented in-context learning systems for document question answering. The attacks exploit statistics computed over query text prefixes to distinguish member from non-member documents. The first attack employs a reference model to estimate an unavailable loss; the second replaces the reference model with a novel weighted-averaging scheme over prefix statistics. Comprehensive experiments on paraphrased queries demonstrate that both attacks remain effective, outperform three prior attacks in many settings even with few prefixes, and that an adapted ensemble-prompting defense substantially reduces leakage from the second attack.
Significance. If the empirical results hold, the work identifies a concrete privacy risk in practical RAG-ICL deployments where the service provider and end users are distinct parties. The stricter paraphrased-query threat model and the demonstration that attacks succeed with small numbers of prefixes are practically relevant. The reference-model-free weighted-averaging attack and the adapted defense are constructive contributions. The manuscript supplies reproducible empirical evaluations and falsifiable attack definitions that can be tested on other retrievers and corpora.
major comments (2)
- [§4 and abstract] §4 (Empirical Evaluations) and the abstract: the central claim that the attacks exhibit 'stronger resilience to paraphrasing' and outperform prior attacks rests on the retrieval function continuing to surface member documents preferentially on the basis of prefix statistics even after semantic rewriting. The manuscript does not report whether the retriever is lexical or embedding-based, nor does it include an ablation that replaces the retriever with a standard semantic embedding model while keeping the same paraphrases. If embedding-based retrieval is used, paraphrases can preserve similarity scores while scrambling prefix distributions, which would make the observed outperformance an artifact of the specific retriever rather than a general property of prefix-based attacks.
- [§3] §3 (Attack 2, weighted-averaging scheme): the membership statistic is defined via a simple weighted average over prefixes, yet the manuscript does not specify how the weights are computed or whether they depend on any statistics of the query distribution. If the weights are derived from the same corpus that the adversary is trying to attack, the scheme is no longer strictly black-box with respect to the target retrieval corpus; this must be clarified because it directly affects the attack's claimed practicality.
minor comments (2)
- All tables reporting AUC or accuracy should include the exact number of prefixes used, the paraphrasing method, and the retrieval model (including embedding dimension or lexical metric) so that the 'small number of prefixes' claim can be reproduced.
- The description of the adapted ensemble-prompting defense should include the exact prompt templates and the number of ensemble members so that the mitigation results can be verified independently.
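The adapted defense is not specified in this summary; a generic ensemble-prompting sketch (the function names, the majority-vote aggregation, and the parameters m and k are all assumptions, not the paper's exact construction) would look like:

```python
import random
from collections import Counter

def ensemble_answer(query, example_pool, answer_fn, m=5, k=4, seed=0):
    """Generic ensemble-prompting sketch: build m prompts, each with a
    different random subset of in-context examples, and majority-vote
    the answers so no single retrieved document dominates the output."""
    local_rng = random.Random(seed)
    votes = Counter()
    for _ in range(m):
        examples = local_rng.sample(example_pool, k)
        votes[answer_fn(query, examples)] += 1
    return votes.most_common(1)[0][0]
```

The intuition for why this blunts the weighted-average attack: each prefix query now sees a randomized mixture of examples, which dilutes the per-document retrieval signal the attack aggregates.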
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and will incorporate clarifications and revisions into the next version of the manuscript.
read point-by-point responses
-
Referee: [§4 and abstract] §4 (Empirical Evaluations) and the abstract: the central claim that the attacks exhibit 'stronger resilience to paraphrasing' and outperform prior attacks rests on the retrieval function continuing to surface member documents preferentially on the basis of prefix statistics even after semantic rewriting. The manuscript does not report whether the retriever is lexical or embedding-based, nor does it include an ablation that replaces the retriever with a standard semantic embedding model while keeping the same paraphrases. If embedding-based retrieval is used, paraphrases can preserve similarity scores while scrambling prefix distributions, which would make the observed outperformance an artifact of the specific retriever rather than a general property of prefix-based attacks.
Authors: We agree that the retriever type must be explicitly stated to allow proper interpretation of the paraphrasing results. We will revise §4 and the abstract to clearly report the retrieval function used in all experiments. We will also add a discussion of the implications for lexical versus embedding-based retrievers and note that our empirical claims are tied to the evaluated retrieval setup. An ablation replacing the retriever with a standard semantic embedding model while reusing the same paraphrases would strengthen generality; we will include this as a new experiment in the revision if feasible, or otherwise expand the limitations section to address the concern directly. revision: partial
-
Referee: [§3] §3 (Attack 2, weighted-averaging scheme): the membership statistic is defined via a simple weighted average over prefixes, yet the manuscript does not specify how the weights are computed or whether they depend on any statistics of the query distribution. If the weights are derived from the same corpus that the adversary is trying to attack, the scheme is no longer strictly black-box with respect to the target retrieval corpus; this must be clarified because it directly affects the attack's claimed practicality.
Authors: We thank the referee for highlighting this ambiguity. The weights are computed using only the lengths of the available query prefixes (longer prefixes receive proportionally higher weight, normalized to sum to one) and do not depend on any statistics from the target retrieval corpus or the query distribution of the attacked system. No corpus-specific information is required or used. We will revise §3 to include the precise weighting formula and an explicit statement that the attack remains strictly black-box with respect to the target corpus. revision: yes
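As a hedged illustration of the rebuttal's description (weights proportional to prefix length, normalized to sum to one, with no corpus statistics), the scheme could be written as:

```python
def prefix_weights(prefix_lengths):
    """Weights proportional to prefix length, normalized to sum to one,
    as the (simulated) rebuttal describes; uses no corpus statistics."""
    total = sum(prefix_lengths)
    return [length / total for length in prefix_lengths]

def weighted_membership_score(signals, prefix_lengths):
    """Weighted average of per-prefix black-box signals."""
    weights = prefix_weights(prefix_lengths)
    return sum(w * s for w, s in zip(weights, signals))
```

Note the tension with the paper passage quoted in the theorem-link section below, which describes a decaying weight; the rebuttal here is simulated, so the sketch tracks its wording, not a confirmed formula.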
Circularity Check
No significant circularity in attack definitions or empirical claims
full rationale
The paper defines its two black-box membership inference attacks via direct computations: one using a reference model to estimate loss on query prefixes, and the second via a weighted-averaging scheme on model outputs. These are algorithmic procedures, not mathematical derivations. The central claims rest on empirical evaluations (including paraphrased-query settings and comparisons to three prior attacks) rather than any equations, fitted parameters renamed as predictions, or self-citation chains that bear the load. No steps exhibit self-definitional loops, ansatz smuggling, or uniqueness theorems imported from the authors' prior work. The work is self-contained against external benchmarks and matches the reader's assessment of non-circular empirical construction.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost (J(x) = ½(x + x⁻¹) − 1) · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
The function ϕ is the most interesting part of the attack... We would like our score function to amplify early signals and provide diminishing returns for the subsequent answers. This can be achieved by using a decaying function such as ϕ(i) = 1/i or 1/log(i).
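A minimal rendering of the quoted decaying score, assuming ϕ(i) = 1/i and a normalized weighted sum (the paper's exact normalization is not shown in this excerpt):

```python
def phi(i: int) -> float:
    """Decaying weight phi(i) = 1/i: amplify early prefixes, give
    diminishing returns to later ones (1/log(i) is the quoted alternative)."""
    return 1.0 / i

def decayed_score(signals) -> float:
    """phi-weighted sum of per-prefix signals, normalized by total weight."""
    weights = [phi(i) for i in range(1, len(signals) + 1)]
    return sum(w * s for w, s in zip(weights, signals)) / sum(weights)
```

With this decay, an early-prefix hit moves the score more than a late one, which is the "amplify early signals" behavior the passage describes.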
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
In-context examples selection for machine translation
Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, and Marjan Ghazvininejad. In-context examples selection for machine translation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 8857–8873. Association for Computational Linguistics, 2023
2023
-
[2]
The llama 3 herd of models, 2024
AI@Meta. The llama 3 herd of models, 2024
2024
-
[3]
Private prediction for large-scale synthetic text generation
Kareem Amin, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva, Umar Syed, Andreas Terzis, and Sergei Vassilvitskii. Private prediction for large-scale synthetic text generation. InFindings of the Associa- tion for Computational Linguistics: EMNLP 2024, pages 7244–7262, 2024
2024
-
[4]
Datasheet for the pile
Stella Biderman, Kieran Bicheno, and Leo Gao. Datasheet for the pile. arXiv preprint arXiv:2201.07311, 2022
2022
-
[5]
Pythia: A suite for analyzing large language models across training and scaling
Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O’Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Singh, et al. Pythia: A suite for analyzing large language models across training and scaling. In Proceedings of the 40th International Conference on Machine Learning, volume 202, pages 2399–2415. PMLR, 2023
2023
-
[6]
Impact of sample selection on in-context learning for entity extraction from scientific writing
Necva Bölücü, Maciej Rybinski, and Stephen Wan. Impact of sample selection on in-context learning for entity extraction from scientific writing. In Findings of the Association for Computational Linguistics: EMNLP 2023, December 2023
2023
-
[7]
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Dario Amodei, Alec Radford, Ilya Sutskever, and Jack Cla...
-
[8]
Membership inference attacks from first principles
Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramèr. Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP), pages 1897–1914, 2022
2022
-
[9]
Extracting training data from large language models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Xiaodong Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. In USENIX Security Symposium, 2020
2020
-
[10]
Privacy side channels in machine learning systems
Edoardo Debenedetti, Giorgio Severi, Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, Milad Nasr, Eric Wallace, and Florian Tramèr. Privacy side channels in machine learning systems. In Proceedings of the 33rd USENIX Conference on Security Symposium, USA, 2024. USENIX Association
2024
-
[11]
Gemma: Open Models Based on Gemini Research and Technology
Google DeepMind and Google. Gemma: Open models based on gemini research and technology. arXiv preprint arXiv:2403.08295, 2024
2024
-
[12]
A survey on in-context learning
Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Baobao Chang, Xu Sun, Lei Li, and Zhifang Sui. A survey on in-context learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, November 2024
2024
-
[13]
The faiss library
Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, and Hervé Jégou. The faiss library. 2024
2024
-
[14]
Flocks of stochastic parrots: Differentially private prompt learning for large language models
Haonan Duan, Adam Dziedzic, Nicolas Papernot, and Franziska Boenisch. Flocks of stochastic parrots: Differentially private prompt learning for large language models. In Advances in Neural Information Processing Systems, 2023
2023
-
[15]
On the privacy risk of in-context learning, 2024
Haonan Duan, Adam Dziedzic, Mohammad Yaghini, Nicolas Papernot, and Franziska Boenisch. On the privacy risk of in-context learning, 2024
2024
-
[16]
Do membership inference attacks work on large language models?
Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, and Hannaneh Hajishirzi. Do membership inference attacks work on large language models? In Conference on Language Modeling (COLM), 2024
2024
-
[17]
Data-adaptive differentially private prompt synthesis for in-context learning
Fengyu Gao, Ruida Zhou, Tianhao Wang, Cong Shen, and Jing Yang. Data-adaptive differentially private prompt synthesis for in-context learning. In The Thirteenth International Conference on Learning Representations, 2025
2025
-
[18]
Demystifying prompts in language models via perplexity estimation
Hila Gonen, Srini Iyer, Terra Blevins, Noah Smith, and Luke Zettlemoyer. Demystifying prompts in language models via perplexity estimation. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10136–10148. Association for Computational Linguistics, December 2023
2023
-
[19]
A survey on llm-as-a-judge
Jiawei Gu, Xuhui Jiang, Zhichao Shi, Hexiang Tan, Xuehao Zhai, Chengjin Xu, Wei Li, Yinghan Shen, Shengjie Ma, Honghao Liu, Yuanzhuo Wang, and Jian Guo. A survey on llm-as-a-judge. CoRR, abs/2411.15594, 2024
2024
-
[20]
User inference attacks on large language models
Nikhil Kandpal, Krishna Pillutla, Alina Oprea, Peter Kairouz, Christopher A. Choquette-Choo, and Zheng Xu. User inference attacks on large language models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, November 2024
2024
-
[21]
Differentially private in-context learning with nearest neighbor search
Antti Koskela, Tejas Kulkarni, and Laith Zumot. Differentially private in-context learning with nearest neighbor search, 2025
2025
-
[22]
What makes good in-context examples for GPT-3?
Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. What makes good in-context examples for GPT-3? In Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Association for Computational Linguistics, May 2022
2022
-
[23]
Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity
Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8086–8098, 2022
2022
-
[24]
SummQA at MEDIQA-chat 2023: In-context learning with GPT-4 for medical summarization
Yash Mathur, Sanketh Rangreji, Raghav Kapoor, Medha Palavalli, Amanda Bertsch, and Matthew Gormley. SummQA at MEDIQA-chat 2023: In-context learning with GPT-4 for medical summarization. In Proceedings of the 5th Clinical Natural Language Processing Workshop, pages 490–502. Association for Computational Linguistics, July 2023
2023
-
[25]
Membership inference attacks against language models via neighbourhood comparison
Justus Mattern, Fatemehsadat Mireshghallah, Zhijing Jin, Bernhard Schoelkopf, Mrinmaya Sachan, and Taylor Berg-Kirkpatrick. Membership inference attacks against language models via neighbourhood comparison. In Findings of the Association for Computational Linguistics: ACL 2023. Association for Computational Linguistics, 2023
2023
-
[26]
The effect of natural distribution shift on question answering models
John Miller, Karl Krauth, Benjamin Recht, and Ludwig Schmidt. The effect of natural distribution shift on question answering models. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 6905–6916. PMLR, 2020
2020
-
[27]
Quantifying privacy risks of masked language models using membership inference attacks
Fatemehsadat Mireshghallah, Kartik Goyal, Archit Uniyal, Taylor Berg-Kirkpatrick, and Reza Shokri. Quantifying privacy risks of masked language models using membership inference attacks. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8332–8347, December 2022
2022
-
[28]
Med-flamingo: a multimodal medical few-shot learner
Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Yash Dalmia, Jure Leskovec, Cyril Zakka, Eduardo Pontes Reis, and Pranav Rajpurkar. Med-flamingo: a multimodal medical few-shot learner. In Proceedings of the 3rd Machine Learning for Health Symposium, Proceedings of Machine Learning Research, 2023
2023
-
[29]
Know what you don’t know: Unanswerable questions for SQuAD
Pranav Rajpurkar, Robin Jia, and Percy Liang. Know what you don’t know: Unanswerable questions for SQuAD. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Melbourne, Australia, July 2018. Association for Computational Linguistics
2018
-
[30]
Sentence-bert: Sentence embeddings using siamese bert-networks
Nils Reimers and Iryna Gurevych. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992. Association for Computational Linguistics, 2019
2019
-
[31]
Membership inference attacks against machine learning models
Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), pages 3–18, 2017
2017
-
[32]
Privacy-preserving in-context learning with differentially private few-shot generation
Xinyu Tang, Richard Shin, Huseyin A Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, and Robert Sim. Privacy-preserving in-context learning with differentially private few-shot generation. In The Twelfth International Conference on Learning Representations, 2024
2024
-
[33]
NewsQA: A machine comprehension dataset
Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and Kaheer Suleman. NewsQA: A machine comprehension dataset. In Proceedings of the 1st Workshop on Representation Learning for NLP, Vancouver, Canada, 2017. Association for Computational Linguistics
2017
-
[34]
Membership inference attacks against in-context learning
Rui Wen, Zheng Li, Michael Backes, and Yang Zhang. Membership inference attacks against in-context learning. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, CCS 2024, pages 3481–3495. ACM, 2024
2024
-
[35]
Privacy-preserving in-context learning for large language models
Tong Wu, Ashwinee Panda, Jiachen T Wang, and Prateek Mittal. Privacy-preserving in-context learning for large language models. In The Twelfth International Conference on Learning Representations, 2024
2024
-
[36]
Machine unlearning: A survey
Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, and Philip S. Yu. Machine unlearning: A survey. ACM Comput. Surv., 56(1)
-
[37]
Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting
Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pages 268–282, Los Alamitos, CA, USA, July 2018. IEEE Computer Society
2018
-
[38]
Counterfactual memorization in neural language models
Chiyuan Zhang, Daphne Ippolito, Katherine Lee, Matthew Jagielski, Florian Tramèr, and Nicholas Carlini. Counterfactual memorization in neural language models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NeurIPS ’23, 2023
2023
-
[39]
Active example selection for in-context learning
Yiming Zhang, Shi Feng, and Chenhao Tan. Active example selection for in-context learning. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9134–9148. Association for Computational Linguistics, December 2022
2022