Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems
Pith reviewed 2026-06-27 00:26 UTC · model grok-4.3
The pith
Editing retriever parameters lets attackers inject malicious passages into RAG results without changing the knowledge base.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CAREATTACK adapts efficient closed-form parameter editing to dense retrieval models, promoting malicious knowledge above benign competing passages and resolving potential parameter conflicts through graph-based conflict detection and parameter editing projection, then performs attack-preserving anchor repair that calibrates the edited retriever to eliminate impact on non-target prompts while preserving attack effectiveness for target prompts.
What carries the argument
Conflict-aware retriever editing, which adapts closed-form parameter editing to dense retrieval models and uses graph-based conflict detection with projection to promote malicious passages.
If this is right
- Malicious passages enter retrieved results for targeted prompts without any changes to the external corpus.
- The attack scales to batches of prompts and passages when retriever parameters are available.
- Anchor repair keeps attack success on targets while reducing effects on unrelated queries.
- Open-source retrievers create a practical attack vector for RAG systems.
Where Pith is reading between the lines
- Corpus-focused detection methods may fail against this attack since the stored text is unaltered.
- Securing retriever parameters or adding monitoring for parameter changes could become a necessary defense layer.
- The editing technique may extend to other dense retrieval settings outside RAG.
- Experiments on larger models would test whether conflict resolution remains effective at scale.
Load-bearing premise
Closed-form parameter editing can be adapted to dense retrievers to promote malicious passages over benign ones while graph-based conflict resolution prevents detectable degradation.
What would settle it
Retrieval rankings after the editing procedure show no increase in position for the malicious passages relative to benign competitors on the same target prompts and models.
Figures
read the original abstract
Injecting malicious knowledge into retrieval-augmented generation (RAG) systems can manipulate retrieved evidence and mislead downstream generation, posing a serious security threat for AI applications. Existing RAG injection attacks mainly rely on manipulating external knowledge bases, such as crafting malicious corpus. However, the synthetic text crafted by such data-centric methods could be detectable, leading to the failure of attacks. Beyond corpus manipulation, open-source retrievers are increasingly exposing RAG systems to model-centric attacks. In this paper, we propose conflict-aware retriever editing, i.e., CAREATTACK, a model-centric retriever attack framework for malicious knowledge injection in RAG. Specifically, CAREATTACK consists two stages of conflict-aware retriever editing and attack-preserving anchor repair. Conflict-aware retriever editing adapts efficient closed-form parameter editing to the dense retrieval model, promoting malicious knowledge above benign competing passages and resolving potential parameter conflicts through graph-based conflict detection and parameter editing projection. Then, attack-preserving anchor repair performs lightweight calibration on the edited retriever to further eliminate the impact on non-target prompts while preserving the attack effectiveness for target prompts. We instantiate CAREATTACK on Qwen3-Embedding-0.6B and BGE-M3, and conduct evaluation on three benchmark datasets. Experimental results demonstrate our method substantially promote malicious passages into the retrieved knowledge of RAG systems and can perform attacks for batches of target prompts and passages, given the access of retrieval model parameters. Since most RAG systems are built upon open-source retrieval models, this work reveals a practical attack surface in RAG systems. Codes are public accessible at https://anonymous.4open.science/r/CareAttack-3F1C.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CAREATTACK, a model-centric attack framework for malicious knowledge injection into LLM-based RAG systems via parameter editing of the dense retriever. It consists of a conflict-aware retriever editing stage that adapts closed-form editing with graph-based conflict detection and projection to promote malicious passages above benign competitors, followed by an attack-preserving anchor repair stage for lightweight calibration on non-target prompts. The authors instantiate the method on Qwen3-Embedding-0.6B and BGE-M3 and claim that experiments on three benchmark datasets demonstrate substantial promotion of malicious passages and batch-attack capability when retrieval model parameters are accessible.
Significance. If the empirical claims hold with adequate quantitative support, the work identifies a previously under-explored attack surface on open-source retrievers that underpins many RAG deployments. This is relevant to the security of retrieval-augmented systems. The public code release is a positive factor for reproducibility.
major comments (2)
- [Method (conflict-aware retriever editing stage)] The description of conflict-aware retriever editing states that the method 'adapts efficient closed-form parameter editing to the dense retrieval model' and uses 'graph-based conflict detection and parameter editing projection,' but supplies no equation, derivation, or explicit update rule showing how the closed-form solution is obtained for a bi-encoder or cross-encoder under a contrastive/ranking objective, nor how the projection preserves the necessary margin for malicious-passage promotion in embedding space. This is load-bearing for the central claim that the two-stage process raises malicious scores above benign competitors without side effects.
- [Experimental evaluation / abstract claim] The abstract asserts that 'experimental results demonstrate our method substantially promote malicious passages' and enable 'attacks for batches of target prompts and passages' on three datasets, yet the provided text contains no success rates, retrieval metrics (e.g., recall@K deltas), baselines, ablation results, or measurements of degradation on non-target queries. Without these, the support for the claim of effective, low-side-effect promotion cannot be evaluated.
minor comments (2)
- [Abstract] Grammatical issues in the abstract: 'consists two stages' should read 'consists of two stages'; 'substantially promote' should be 'substantially promotes'.
- [Abstract] The phrase 'given the access of retrieval model parameters' is awkward; 'given access to the retrieval model parameters' would be clearer.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments point by point below and will revise the manuscript to improve clarity and completeness.
read point-by-point responses
-
Referee: [Method (conflict-aware retriever editing stage)] The description of conflict-aware retriever editing states that the method 'adapts efficient closed-form parameter editing to the dense retrieval model' and uses 'graph-based conflict detection and parameter editing projection,' but supplies no equation, derivation, or explicit update rule showing how the closed-form solution is obtained for a bi-encoder or cross-encoder under a contrastive/ranking objective, nor how the projection preserves the necessary margin for malicious-passage promotion in embedding space. This is load-bearing for the central claim that the two-stage process raises malicious scores above benign competitors without side effects.
Authors: We agree that the current description lacks the necessary mathematical detail. In the revision we will insert a dedicated subsection that (i) states the contrastive objective used for the bi-encoder, (ii) derives the closed-form parameter update rule, (iii) formalizes the graph construction for conflict detection, and (iv) specifies the projection operator together with the margin-preservation argument in embedding space. These additions will make the adaptation of closed-form editing fully explicit and reproducible. revision: yes
-
Referee: [Experimental evaluation / abstract claim] The abstract asserts that 'experimental results demonstrate our method substantially promote malicious passages' and enable 'attacks for batches of target prompts and passages' on three datasets, yet the provided text contains no success rates, retrieval metrics (e.g., recall@K deltas), baselines, ablation results, or measurements of degradation on non-target queries. Without these, the support for the claim of effective, low-side-effect promotion cannot be evaluated.
Authors: We will strengthen the experimental reporting. The revised manuscript will (i) augment the abstract with concrete success rates and recall@K deltas, (ii) add a concise quantitative summary table in the introduction, and (iii) ensure that all tables and figures explicitly report baseline comparisons, ablation results, and non-target query degradation metrics. These changes will supply the quantitative evidence required to evaluate the claims. revision: yes
Circularity Check
No circularity: empirical attack method with experimental validation
full rationale
The paper describes an empirical attack framework (CAREATTACK) that adapts closed-form editing techniques from prior LLM work to dense retrievers, adds graph-based conflict detection and anchor repair, then validates via experiments on Qwen3-Embedding-0.6B, BGE-M3 and three datasets. No derivation, equation, or central claim reduces to its own inputs by construction; the method is a procedural construction whose effectiveness is measured externally rather than asserted via self-referential fitting or self-citation chains. The load-bearing steps are implementation choices and empirical outcomes, not algebraic identities.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Closed-form parameter editing can be adapted to dense retrieval models to promote malicious passages above benign ones while resolving conflicts via graph-based detection and projection.
Reference graph
Works this paper leans on
-
[1]
Retrieval- augmented generation for knowledge-intensive nlp tasks,
P. Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. K ¨uttler, M. Lewis, W.-t. Yih, T. Rockt ¨aschelet al., “Retrieval- augmented generation for knowledge-intensive nlp tasks,”Advances in neural information processing systems, vol. 33, pp. 9459–9474, 2020
2020
-
[2]
Retrieval augmented language model pre-training,
K. Guu, K. Lee, Z. Tung, P. Pasupat, and M. Chang, “Retrieval augmented language model pre-training,” inInternational conference on machine learning. PMLR, 2020, pp. 3929–3938
2020
-
[3]
Dense passage retrieval for open-domain question answering,
V . Karpukhin, B. Oguz, S. Min, P. Lewis, L. Wu, S. Edunov, D. Chen, and W.-t. Yih, “Dense passage retrieval for open-domain question answering,” inProceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), 2020, pp. 6769– 6781
2020
-
[4]
Colbert: Efficient and effective passage search via contextualized late interaction over bert,
O. Khattab and M. Zaharia, “Colbert: Efficient and effective passage search via contextualized late interaction over bert,” inProceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020, pp. 39–48
2020
-
[5]
Approximate nearest neighbor negative contrastive learning for dense text retrieval,
L. Xiong, C. Xiong, Y . Li, K.-F. Tang, J. Liu, P. Bennett, J. Ahmed, and A. Overwijk, “Approximate nearest neighbor negative contrastive learning for dense text retrieval,”arXiv preprint arXiv:2007.00808, 2020
arXiv 2007
-
[6]
{PoisonedRAG}: Knowl- edge corruption attacks to{Retrieval-Augmented}generation of large language models,
W. Zou, R. Geng, B. Wang, and J. Jia, “{PoisonedRAG}: Knowl- edge corruption attacks to{Retrieval-Augmented}generation of large language models,” in34th USENIX Security Symposium (USENIX Security 25), 2025, pp. 3827–3844
2025
-
[7]
Poisoning retrieval corpora by injecting adversarial passages,
Z. Zhong, Z. Huang, A. Wettig, and D. Chen, “Poisoning retrieval corpora by injecting adversarial passages,” inProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 13 764–13 775
2023
-
[8]
Badrag: Identifying vulnerabilities in retrieval augmented generation of large language models,
J. Xue, M. Zheng, Y . Hu, F. Liu, X. Chen, and Q. Lou, “Badrag: Identifying vulnerabilities in retrieval augmented generation of large language models,”arXiv preprint arXiv:2406.00083, 2024
arXiv 2024
-
[9]
Not what you’ve signed up for: Compromising real- world llm-integrated applications with indirect prompt injection,
K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, “Not what you’ve signed up for: Compromising real- world llm-integrated applications with indirect prompt injection,” in Proceedings of the 16th ACM workshop on artificial intelligence and security, 2023, pp. 79–90
2023
-
[10]
Emorag: Evaluating rag robustness to symbolic perturbations,
X. Zhou, X. Li, Y . Peng, M. Xu, X. Zhang, M. Yu, Y . Wang, X. Jia, K. Wang, Q. Wenet al., “Emorag: Evaluating rag robustness to symbolic perturbations,” inProceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V . 1, 2026, pp. 2100–2111
2026
-
[11]
Whispers in grammars: Injecting covert backdoors to compromise dense retrieval systems,
Q. Long, Y . Deng, L. Gan, W. Wang, and S. Jialin Pan, “Whispers in grammars: Injecting covert backdoors to compromise dense retrieval systems,”arXiv e-prints, pp. arXiv–2402, 2024
2024
-
[12]
Trojanrag: Retrieval-augmented generation can be backdoor driver in large language models,
P. Cheng, Y . Ding, T. Ju, Z. Wu, W. Du, P. Yi, Z. Zhang, and G. Liu, “Trojanrag: Retrieval-augmented generation can be backdoor driver in large language models,”arXiv preprint arXiv:2405.13401, 2024
arXiv 2024
-
[13]
C. Clop and Y . Teglia, “Backdoored retrievers for prompt injection attacks on retrieval augmented generation of large language models,” arXiv preprint arXiv:2410.14479, 2024
arXiv 2024
-
[14]
Trustrag: enhancing robustness and trustworthiness in retrieval-augmented generation,
H. Zhou, K.-H. Lee, Z. Zhan, Y . Chen, Z. Li, Z. Wang, H. Haddadi, and E. Yilmaz, “Trustrag: enhancing robustness and trustworthiness in retrieval-augmented generation,”arXiv preprint arXiv:2501.00879, 2025
arXiv 2025
-
[15]
Lora: Low-rank adaptation of large language mod- els
E. J. Hu, Y . Shen, P. Wallis, Z. Allen-Zhu, Y . Li, S. Wang, L. Wang, W. Chenet al., “Lora: Low-rank adaptation of large language mod- els.”Iclr, vol. 1, no. 2, p. 3, 2022
2022
-
[16]
Qwen3 embedding: Advancing text embedding and reranking through foundation models,
Y . Zhang, M. Li, D. Long, X. Zhang, H. Lin, B. Yang, P. Xie, A. Yang, D. Liu, J. Linet al., “Qwen3 embedding: Advancing text embedding and reranking through foundation models,”arXiv preprint arXiv:2506.05176, 2025
Pith/arXiv arXiv 2025
-
[17]
M.-L. M.-F. Multi-Granularity, “M3-embedding: Multi-linguality, multi-functionality, multi-granularity text embeddings through self- knowledge distillation,”arXiv preprint arXiv:2402.03216, 2024
Pith/arXiv arXiv 2024
-
[18]
Natural questions: a benchmark for question answering research,
T. Kwiatkowski, J. Palomaki, O. Redfield, M. Collins, A. Parikh, C. Alberti, D. Epstein, I. Polosukhin, J. Devlin, K. Leeet al., “Natural questions: a benchmark for question answering research,” Transactions of the Association for Computational Linguistics, vol. 7, pp. 453–466, 2019
2019
-
[19]
Ms marco: A human generated machine reading comprehension dataset,
P. Bajaj, D. Campos, N. Craswell, L. Deng, J. Gao, X. Liu, R. Ma- jumder, A. McNamara, B. Mitra, T. Nguyenet al., “Ms marco: A human generated machine reading comprehension dataset,”arXiv preprint arXiv:1611.09268, 2016
Pith/arXiv arXiv 2016
-
[20]
Hotpotqa: A dataset for diverse, explainable multi- hop question answering,
Z. Yang, P. Qi, S. Zhang, Y . Bengio, W. Cohen, R. Salakhutdinov, and C. D. Manning, “Hotpotqa: A dataset for diverse, explainable multi- hop question answering,” inProceedings of the 2018 conference on empirical methods in natural language processing, 2018, pp. 2369– 2380
2018
-
[21]
Leveraging passage retrieval with gener- ative models for open domain question answering,
G. Izacard and E. Grave, “Leveraging passage retrieval with gener- ative models for open domain question answering,” inProceedings of the 16th conference of the european chapter of the association for computational linguistics: main volume, 2021, pp. 874–880
2021
-
[22]
Improving language models by retrieving from trillions of tokens,
S. Borgeaud, A. Mensch, J. Hoffmann, T. Cai, E. Rutherford, K. Mil- lican, G. B. Van Den Driessche, J.-B. Lespiau, B. Damoc, A. Clark et al., “Improving language models by retrieving from trillions of tokens,” inInternational conference on machine learning. PMLR, 2022, pp. 2206–2240
2022
-
[23]
Retrieval-augmented generation for large language models: A survey,
Y . Gao, Y . Xiong, X. Gao, K. Jia, J. Pan, Y . Bi, Y . Dai, J. Sun, H. Wang, H. Wanget al., “Retrieval-augmented generation for large language models: A survey,”arXiv preprint arXiv:2312.10997, vol. 2, no. 1, p. 32, 2023
Pith/arXiv arXiv 2023
-
[24]
Sentence-bert: Sentence embeddings using siamese bert-networks,
N. Reimers and I. Gurevych, “Sentence-bert: Sentence embeddings using siamese bert-networks,” inProceedings of the 2019 confer- ence on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), 2019, pp. 3982–3992
2019
-
[25]
Unsupervised dense information retrieval with contrastive learning,
G. Izacard, M. Caron, L. Hosseini, S. Riedel, P. Bojanowski, A. Joulin, and E. Grave, “Unsupervised dense information retrieval with contrastive learning,”arXiv preprint arXiv:2112.09118, 2021
Pith/arXiv arXiv 2021
-
[26]
Large dual encoders are generalizable retrievers,
J. Ni, C. Qu, J. Lu, Z. Dai, G. H. Abrego, J. Ma, V . Zhao, Y . Luan, K. Hall, M.-W. Changet al., “Large dual encoders are generalizable retrievers,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022, pp. 9844–9855
2022
-
[27]
Text embeddings by weakly-supervised con- trastive pre-training,
L. Wang, N. Yang, X. Huang, B. Jiao, L. Yang, D. Jiang, R. Ma- jumder, and F. Wei, “Text embeddings by weakly-supervised con- trastive pre-training,”arXiv preprint arXiv:2212.03533, 2022
Pith/arXiv arXiv 2022
-
[28]
Lost in the middle: How language models use long contexts,
N. F. Liu, K. Lin, J. Hewitt, A. Paranjape, M. Bevilacqua, F. Petroni, and P. Liang, “Lost in the middle: How language models use long contexts,”Transactions of the association for computational linguis- tics, vol. 12, pp. 157–173, 2024
2024
-
[29]
Large language models can be easily distracted by irrelevant context,
F. Shi, X. Chen, K. Misra, N. Scales, D. Dohan, E. H. Chi, N. Sch ¨arli, and D. Zhou, “Large language models can be easily distracted by irrelevant context,” inInternational Conference on Machine Learning. PMLR, 2023, pp. 31 210–31 227
2023
-
[30]
The power of noise: Redefining retrieval for rag systems,
F. Cuconasu, G. Trappolini, F. Siciliano, S. Filice, C. Campagnano, Y . Maarek, N. Tonellotto, and F. Silvestri, “The power of noise: Redefining retrieval for rag systems,” inProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024, pp. 719–729
2024
-
[31]
How faithful are rag models? quantifying the tug-of-war between rag and llms’ internal prior,
K. Wu, E. Wu, and J. Zou, “How faithful are rag models? quantifying the tug-of-war between rag and llms’ internal prior,”arXiv preprint arXiv:2404.10198, vol. 3, no. 1, 2024
arXiv 2024
-
[32]
Phantom: General backdoor attacks on retrieval augmented language genera- tion,
H. Chaudhari, G. Severi, J. Abascal, A. Suri, M. Jagielski, C. A. Choquette-Choo, M. Nasr, C. Nita-Rotaru, and A. Oprea, “Phantom: General backdoor attacks on retrieval augmented language genera- tion,”ACM Transactions on AI Security and Privacy, 2024
2024
-
[33]
Cpa-rag: Covert poisoning attacks on retrieval-augmented generation in large language models,
C. Li, J. Zhang, A. Cheng, Z. Ma, X. Li, and J. Ma, “Cpa-rag: Covert poisoning attacks on retrieval-augmented generation in large language models,”arXiv preprint arXiv:2505.19864, 2025
arXiv 2025
-
[34]
R. Sui, “Ctrlrag: Black-box adversarial attacks based on masked language models in retrieval-augmented language generation,”arXiv preprint arXiv:2503.06950, 2025
arXiv 2025
-
[35]
C. Choi, J. Kim, S. Cho, S. Jeong, and B. Chang, “The rag paradox: A black-box attack exploiting unintentional vulnerabilities in retrieval- augmented generation systems,”arXiv preprint arXiv:2502.20995, 2025
arXiv 2025
-
[36]
Confusedpilot: Confused deputy risks in rag-based llms,
A. RoyChowdhury, M. Luo, P. Sahu, S. Banerjee, and M. Tiwari, “Confusedpilot: Confused deputy risks in rag-based llms,”arXiv preprint arXiv:2408.04870, 2024
arXiv 2024
-
[37]
One shot dominance: Knowledge poisoning at- tack on retrieval-augmented generation systems,
Z. Chang, M. Li, X. Jia, J. Wang, Y . Huang, Z. Jiang, Y . Liu, and Q. Wang, “One shot dominance: Knowledge poisoning at- tack on retrieval-augmented generation systems,”arXiv preprint arXiv:2505.11548, 2025
Pith/arXiv arXiv 2025
-
[38]
{Topic-FlipRAG}:{Topic-Orientated}adversarial opinion manipulation attacks to{Retrieval-Augmented}generation models,
Y . Gong, Z. Chen, J. Liu, M. Chen, F. Yu, W. Lu, X. Wang, and X. Liu, “{Topic-FlipRAG}:{Topic-Orientated}adversarial opinion manipulation attacks to{Retrieval-Augmented}generation models,” in34th USENIX Security Symposium (USENIX Security 25), 2025, pp. 3807–3826
2025
-
[39]
Typos that broke the rag’s back: Genetic attack on rag pipeline by simulating documents in the wild via low-level perturbations,
S. Cho, S. Jeong, J. Seo, T. Hwang, and J. C. Park, “Typos that broke the rag’s back: Genetic attack on rag pipeline by simulating documents in the wild via low-level perturbations,” inFindings of the Association for Computational Linguistics: EMNLP 2024, 2024, pp. 2826–2844
2024
-
[40]
J. Wang and F. Yu, “Derag: Black-box adversarial attacks on multiple retrieval-augmented generation applications via prompt injection,” arXiv preprint arXiv:2507.15042, 2025
arXiv 2025
-
[41]
Editing factual knowledge in lan- guage models,
N. De Cao, W. Aziz, and I. Titov, “Editing factual knowledge in lan- guage models,” inProceedings of the 2021 conference on empirical methods in natural language processing, 2021, pp. 6491–6506
2021
-
[42]
Editing large language models: Problems, methods, and opportunities,
Y . Yao, P. Wang, B. Tian, S. Cheng, Z. Li, S. Deng, H. Chen, and N. Zhang, “Editing large language models: Problems, methods, and opportunities,” inProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 10 222–10 240
2023
-
[43]
Aging with grace: Lifelong model editing with dis- crete key-value adaptors,
T. Hartvigsen, S. Sankaranarayanan, H. Palangi, Y . Kim, and M. Ghassemi, “Aging with grace: Lifelong model editing with dis- crete key-value adaptors,”Advances in Neural Information Processing Systems, vol. 36, pp. 47 934–47 959, 2023
2023
-
[44]
Alphaedit: Null-space constrained knowledge editing for language models,
J. Fang, H. Jiang, K. Wang, Y . Ma, J. Shi, X. Wang, X. He, and T.- S. Chua, “Alphaedit: Null-space constrained knowledge editing for language models,” inInternational Conference on Learning Repre- sentations, vol. 2025, 2025, pp. 16 366–16 396
2025
-
[45]
Knowl- edge neurons in pretrained transformers,
D. Dai, L. Dong, Y . Hao, Z. Sui, B. Chang, and F. Wei, “Knowl- edge neurons in pretrained transformers,” inProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 8493–8502
2022
-
[46]
Locating and editing factual associations in gpt,
K. Meng, D. Bau, A. Andonian, and Y . Belinkov, “Locating and editing factual associations in gpt,”Advances in neural information processing systems, vol. 35, pp. 17 359–17 372, 2022
2022
-
[47]
Mass-editing memory in a transformer,
K. Meng, A. S. Sharma, A. Andonian, Y . Belinkov, and D. Bau, “Mass-editing memory in a transformer,”arXiv preprint arXiv:2210.07229, 2022
Pith/arXiv arXiv 2022
-
[48]
E. Mitchell, C. Lin, A. Bosselut, C. Finn, and C. D. Manning, “Fast model editing at scale,”arXiv preprint arXiv:2110.11309, 2021
arXiv 2021
-
[49]
Memory-based model editing at scale,
E. Mitchell, C. Lin, A. Bosselut, C. D. Manning, and C. Finn, “Memory-based model editing at scale,” inInternational Conference on Machine Learning. PMLR, 2022, pp. 15 817–15 831
2022
-
[50]
Gradient surgery for multi-task learning,
T. Yu, S. Kumar, A. Gupta, S. Levine, K. Hausman, and C. Finn, “Gradient surgery for multi-task learning,”Advances in neural infor- mation processing systems, vol. 33, pp. 5824–5836, 2020
2020
-
[51]
Conflict-averse gradient descent for multi-task learning,
B. Liu, X. Liu, X. Jin, P. Stone, and Q. Liu, “Conflict-averse gradient descent for multi-task learning,”Advances in neural information processing systems, vol. 34, pp. 18 878–18 890, 2021. Figure 7. Base retriever results on the financially sensitive retirement- account prompt. The base retriever returns benign competing passages related to IRA contribut...
2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.