pith. machine review for the scientific record.

arxiv: 2604.23750 · v2 · submitted 2026-04-26 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links

· Lean Theorem

The Override Gap: A Magnitude Account of Knowledge Conflict Failure in Hypernetwork-Based Instant LLM Adaptation

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 03:17 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords hypernetwork adaptation · knowledge conflict · LLM override · magnitude scaling · instant internalization · Doc-to-LoRA · prior strength · parameter-efficient fine-tuning

The pith

Hypernetwork adapters fail on knowledge conflicts because their fixed margin is outscaled by the pretrained model's growing margin on frequent facts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that instant document internalization via hypernetworks such as Doc-to-LoRA collapses on contradictory facts because the injected adapter maintains roughly constant magnitude while the base model's knowledge margin scales with training frequency. This produces a systematic override gap that widens on deeper priors, with baseline accuracy falling from 68 percent on weak conflicts to 16 percent on strong ones across 194 tested cases. The account is supported by the observation that the hypernetwork already routes to the correct layers, so the problem is not representational but one of relative amplitude. Two training-free interventions, Selective Layer Boosting that amplifies the adapter at its highest-norm layers and Conflict-Aware Internalization that applies boosting only when the base model is confident, close most of the gap and raise deep-conflict accuracy to 71 percent on Gemma-2B and 72.5 percent on Mistral-7B.

Core claim

The failure of hypernetwork-based instant adaptation on knowledge conflicts is a magnitude problem: the adapter margin stays approximately constant across documents while the pretrained margin grows with the base model's training frequency on the contradicted fact, so deep conflicts lose by construction; selectively scaling the adapter at its top-norm layers only when the base model assigns high probability to its original answer raises accuracy on the strongest priors from 46.4 percent to 71.0 percent on Gemma-2B and from 53.6 percent to 72.5 percent on Mistral-7B.
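The core claim reduces to a one-line inequality. The notation below is editorial shorthand for the margins described above (what the figure captions call Eq. (6)); the paper's own symbols may differ:

```latex
% Editorial shorthand for the override competition.
% \Delta_{\mathrm{lora}}: logit margin the adapter contributes toward the document's answer
% \Delta_{\mathrm{prior}}(f): pretrained margin toward the original answer,
%                             growing with the fact's training frequency f
\Delta_{\mathrm{lora}} \approx \text{const.} \;\text{across documents}, \qquad
\Delta_{\mathrm{prior}}(f) \;\text{grows with}\; \log f, \qquad
\text{override succeeds iff}\;\; \beta \, \Delta_{\mathrm{lora}} > \Delta_{\mathrm{prior}}(f)
```

where β is the boost factor (β = 1 for the unmodified adapter). Deep conflicts fail because the right-hand side grows while the left-hand side stays fixed.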

What carries the argument

The override gap between the constant adapter margin and the frequency-dependent pretrained margin, together with Selective Layer Boosting that multiplies the adapter scale at high-norm layers and Conflict-Aware Internalization that gates the boost on base-model confidence.
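As a reading aid, the two interventions can be sketched in a few lines. This is a minimal editorial reconstruction from the description above, not the authors' code; the adapter layout (a dict of per-layer LoRA factor pairs), the default k, β, and the confidence threshold τ are all assumed:

```python
import math

def frob(M):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in M for x in row))

def layer_norm_products(adapter):
    """Per-layer norm product s_l = ||A_l||_F * ||B_l||_F."""
    return {l: frob(A) * frob(B) for l, (A, B) in adapter.items()}

def scale(M, c):
    return [[c * x for x in row] for row in M]

def selective_layer_boost(adapter, k=0.25, beta=1.5):
    """SLB: multiply the adapter by beta only at its top-k highest-norm layers.
    Scaling A_l by beta scales the update Delta W_l = B_l A_l by beta."""
    s = layer_norm_products(adapter)
    top = set(sorted(s, key=s.get, reverse=True)[: max(1, int(k * len(s)))])
    return {l: ((scale(A, beta), B) if l in top else (A, B))
            for l, (A, B) in adapter.items()}

def conflict_aware_internalize(adapter, base_confidence, tau=0.8):
    """CAI: apply boosting only when the base model is confident in its prior answer."""
    return selective_layer_boost(adapter) if base_confidence > tau else adapter
```

The gating step is what protects novel-knowledge recall: when the base model has no strong prior, the adapter is left at its generated scale.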

If this is right

  • Deep-conflict accuracy rises by roughly 24 points on Gemma-2B and 19 points on Mistral-7B while novel-knowledge recall remains intact.
  • The method outperforms vanilla retrieval-augmented generation by 18 points on medium-strength conflicts despite operating entirely inside parameter space.
  • The same magnitude gap should appear in any hypernetwork or low-rank adaptation scheme whose update norm does not grow with the strength of the fact being overridden.
  • A benchmark of 489 questions now separates novel recall, cross-knowledge combination, and prior-graded conflicts for systematic testing.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If magnitude scaling proves general, other parameter-efficient adaptation techniques may need explicit margin calibration rather than purely representational alignment.
  • The approach could be extended to continual learning settings where successive documents arrive with varying conflict depths.
  • One testable extension is to replace the binary high-norm selection with a continuous weighting proportional to the per-layer margin gap observed at inference time.
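The third bullet is concrete enough to sketch. The function below is a hypothetical illustration of continuous weighting; `margin_gap` (a per-layer estimate of how far the pretrained margin exceeds the adapter margin) and `gamma` are invented names, not quantities the paper measures:

```python
def continuous_boost_factors(margin_gap, gamma=0.5):
    """Map each layer's margin gap to a smooth boost factor >= 1.
    Layers where the prior dominates most get the largest boost;
    layers with no gap (or a negative one) are left unscaled."""
    hi = max(max(margin_gap.values()), 1e-9)
    return {l: 1.0 + gamma * max(g, 0.0) / hi for l, g in margin_gap.items()}
```

This replaces the binary top-k rule with a dose that tracks the observed gap, at the cost of needing a per-question margin estimate at inference time.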

Load-bearing premise

That the observed accuracy drop with increasing prior strength is caused by the magnitude mismatch rather than by differences in representation quality or optimization dynamics.

What would settle it

Measure whether selectively scaling only the top-norm layers of the adapter improves deep-conflict accuracy without reducing performance on non-conflict or novel-fact questions; if the improvement disappears when the same scale factor is applied uniformly instead of selectively, the magnitude account is supported.
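The settling criterion above can be encoded as a simple predicate over measured accuracies. All argument names and the recall tolerance are editorial choices:

```python
def magnitude_account_supported(acc_selective, acc_uniform, acc_baseline,
                                recall_selective, recall_baseline, tol=0.02):
    """The magnitude account survives if selective boosting beats both the
    unboosted baseline and uniform boosting at the same scale factor,
    while novel-fact recall stays within `tol` of baseline."""
    improves = acc_selective > acc_baseline
    selectivity_matters = acc_selective > acc_uniform
    recall_preserved = recall_selective >= recall_baseline - tol
    return improves and selectivity_matters and recall_preserved
```

If uniform scaling matched selective scaling, the gain would be attributable to amplitude alone with no role for layer selection, which is the alternative the test is designed to separate.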

Figures

Figures reproduced from arXiv: 2604.23750 by Mingwei Li, Shuaizhi Cheng, Xiang Shi.

Figure 1. Overview of the override gap. (1) A hypernetwork generates a per-layer LoRA adapter from a document, but the right signal reaches the right layers at insufficient amplitude to override pretrained knowledge. (2) The document fact wins only when the adapter margin Δlora exceeds the pretrained margin Δprior, which grows with prior strength. (3) Selective Layer Boosting (SLB) amplifies the most active adapter…
Figure 2. The override competition (Eq. (6)) across three conflict difficulty levels…
Figure 3. Conflict-Aware routing. The base model is probed first with the question alone…
Figure 4. Conflict accuracy on KID-Bench v2 across three difficulty levels…
Figure 5. Cross-model validation of Conflict-Aware Internalization. Novel recall is preserved across…
Figure 6. Global versus selective amplification on SQuAD (200 samples, official pipeline)…
Figure 7. Dose-response of conflict accuracy vs. β at k = 25% on Gemma-2B (circles) and Mistral-7B (triangles), with novel recall (squares). The logistic fit (solid) follows Eq. (6); novel recall remains near ceiling.
Figure 8. Per-question Δprior vs. Δlora on 194 KID-Bench conflict questions, colored by difficulty level. The diagonal separates override (circles above) from non-override (crosses below). All 194 points fall on the predicted side, confirming Eq. (6) without free parameters.
Figure 9. Accuracy vs. prior strength on 194 conflict questions, using the base model's log-probability on the contradicted fact.
Figure 10. Linear interpolation between two documents' LoRA adapters produces a smooth behav…
Figure 11. Causal layer intervention on 20 C-deep conflicts. Zeroing any layer group drops override…
Figure 12. Per-layer causal importance on 69 Gemma C-deep questions. Three middle layers (8, 12,…
Figure 13. Distribution of minimum β needed to override each of 194 conflict questions, grouped by difficulty level. Mean required β rises monotonically with prior strength (1.26 on C-light to 1.50 on C-deep), the direction predicted by the magnitude account.
Figure 14. Ablation grid over SLB parameters. The left heatmap reports conflict accuracy and the…
Figure 15. Per-layer norm products sl = ∥Al∥F · ∥Bl∥F across 20 documents for Gemma-2B and Mistral-7B, normalized per document. The top-sl layers form a stable band across documents, justifying a fixed top-k selection rule.
Original abstract

Hypernetwork-based methods such as Doc-to-LoRA internalize a document into an LLM's weights in a single forward pass, but they fail systematically on conflicts: when the document contradicts pretraining knowledge, accuracy collapses to 46.4% on the deepest facts. We show the failure is a magnitude problem rather than a representational one. The hypernetwork already targets the right layers, but its adapter margin is approximately constant across documents while the pretrained margin grows with training frequency, so deep conflicts lose by construction. The account predicts that failure should track prior strength: sorting 194 conflicts by the base model's log-probability on the contradicted fact, baseline accuracy falls from 68% on weak-prior questions to 16% on strong-prior ones, a 52 percentage-point gap. The cure is amplitude. Selective Layer Boosting scales the adapter at its top-norm layers, and Conflict-Aware Internalization triggers boosting only when the base model is confident. Both are training-free; together they raise deep-conflict accuracy from 46.4% to 71.0% on Gemma-2B and from 53.6% to 72.5% on Mistral-7B while preserving novel-knowledge recall, and beat vanilla retrieval-augmented generation on medium conflicts by 18 percentage points despite operating entirely in parameter space. We release KID-Bench, a 489-question benchmark that separates novel recall, cross-knowledge combination, and prior-graded conflicts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes systematic failures of hypernetwork-based instant adaptation (e.g., Doc-to-LoRA) when documents contradict pretrained knowledge. It claims the root cause is a magnitude mismatch rather than a representational one: hypernetwork adapters produce approximately constant margins while pretrained margins grow with log-frequency of the contradicted fact. Evidence consists of a 52-point accuracy drop (68% to 16%) when 194 conflicts are sorted by base-model log-probability on the contradicted fact, plus two training-free interventions (Selective Layer Boosting and Conflict-Aware Internalization) that raise deep-conflict accuracy from 46.4% to 71.0% on Gemma-2B and 53.6% to 72.5% on Mistral-7B while preserving novel-knowledge recall and outperforming vanilla RAG on medium conflicts by 18 points. The paper also releases the KID-Bench benchmark separating novel recall, cross-knowledge combination, and prior-graded conflicts.

Significance. If the magnitude account is correct, the work supplies a parsimonious, training-free explanation and remedy for a recurring failure mode in parameter-space adaptation, together with a useful benchmark that disentangles different knowledge-use regimes. The reported gains are practically relevant and the interventions are simple to implement. However, the significance is reduced by the indirect character of the supporting evidence; direct margin measurements are absent, leaving open the possibility that the accuracy-prior correlation and boosting gains arise from unmeasured confounds such as representation quality or optimization dynamics.

major comments (2)
  1. [Abstract] Abstract and Results: The central claim that 'the hypernetwork already targets the right layers' but merely lacks sufficient scale is not directly tested. No layer-wise comparison of generated ΔW to an ideal fine-tune delta, nor cosine alignment or activation overlap on conflict tokens, is reported across weak vs. strong priors; the success of top-norm boosting could therefore reflect correction of noisy layer selection rather than pure magnitude rescue.
  2. [Experiments] Experiments (sorting and boosting results): The 52-point gap and post-boosting lifts are consistent with the magnitude hypothesis, yet the manuscript provides neither direct quantification of adapter margin (e.g., ||ΔW|| or logit margin on contradicted facts) versus pretrained margin before/after intervention, nor ablations on the boosting threshold or the log-prob sorting cutoff. Without these, the causal link between magnitude mismatch and failure remains unestablished and alternative explanations (representation quality, optimization dynamics) cannot be excluded.
minor comments (2)
  1. [Abstract] The abstract states accuracy figures without error bars or confidence intervals; adding these (and reporting the number of runs) would strengthen the quantitative claims.
  2. The definition of 'adapter margin' and 'pretrained margin' should be stated explicitly in the main text with a short equation or operational description rather than left implicit.
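For what such a definition might look like, here is one operational choice consistent with the review's description (editorial notation; the paper may define the margins differently):

```latex
% m_theta(q): signed logit margin toward the document answer under parameters theta
m_{\theta}(q) \;=\; \log p_{\theta}(a_{\mathrm{doc}} \mid q) \;-\; \log p_{\theta}(a_{\mathrm{orig}} \mid q),
\qquad
\Delta_{\mathrm{prior}} \;=\; -\,m_{\mathrm{base}}(q),
\qquad
\Delta_{\mathrm{lora}} \;=\; m_{\mathrm{adapted}}(q) \;-\; m_{\mathrm{base}}(q)
% The adapted model prefers the document fact iff m_adapted(q) > 0,
% i.e. iff Delta_lora > Delta_prior: the override competition.
```

Under this convention the override condition is exactly the inequality the figures attribute to Eq. (6).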

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the practical value of the interventions and KID-Bench benchmark. We address the two major comments point by point below, acknowledging where the evidence is indirect and proposing concrete revisions to strengthen the causal claims.

Point-by-point responses
  1. Referee: [Abstract] Abstract and Results: The central claim that 'the hypernetwork already targets the right layers' but merely lacks sufficient scale is not directly tested. No layer-wise comparison of generated ΔW to an ideal fine-tune delta, nor cosine alignment or activation overlap on conflict tokens, is reported across weak vs. strong priors; the success of top-norm boosting could therefore reflect correction of noisy layer selection rather than pure magnitude rescue.

    Authors: We acknowledge that the claim of correct layer targeting rests on indirect evidence: the hypernetwork produces adapters whose highest-norm layers, when selectively scaled, improve conflict accuracy while preserving novel recall. We do not provide direct layer-wise comparisons of generated ΔW to an ideal fine-tune delta, cosine alignments, or activation overlaps on conflict tokens stratified by prior strength. Consequently, it remains possible that top-norm boosting partially corrects for noisy layer selection rather than acting purely through magnitude. In the revised manuscript we will add an appendix containing layer-wise norm comparisons between hypernetwork adapters and standard LoRA fine-tunes on a subset of conflicts, together with activation-overlap statistics on conflict tokens for weak- versus strong-prior cases. These additions will help isolate magnitude from selection effects. revision: yes

  2. Referee: [Experiments] Experiments (sorting and boosting results): The 52-point gap and post-boosting lifts are consistent with the magnitude hypothesis, yet the manuscript provides neither direct quantification of adapter margin (e.g., ||ΔW|| or logit margin on contradicted facts) versus pretrained margin before/after intervention, nor ablations on the boosting threshold or the log-prob sorting cutoff. Without these, the causal link between magnitude mismatch and failure remains unestablished and alternative explanations (representation quality, optimization dynamics) cannot be excluded.

    Authors: We agree that the current support for the magnitude account is correlational and interventional rather than based on direct margin measurements. The 52-point accuracy gradient with log-probability and the gains from training-free boosting are consistent with the hypothesis and already rule out optimization dynamics as the sole cause, while preservation of novel recall makes broad representation-quality confounds less plausible. Nevertheless, explicit quantification of adapter norms and logit margins on contradicted facts, before and after boosting, together with threshold and cutoff ablations, would tighten the causal link. In the revision we will add: (i) ||ΔW||_F and logit-margin measurements on conflict facts pre- and post-intervention; (ii) ablations varying the boosting threshold (top-10%, top-20%, top-30% norm layers); and (iii) sensitivity checks on the log-probability quantiles used to define weak/medium/strong conflicts. These results will appear in a new subsection of the Experiments section. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical sorting and interventions are independent of hypernetwork outputs

Full rationale

The paper's magnitude account is tested by sorting 194 conflicts using the base model's pre-adaptation log-probability on contradicted facts (an external measurement independent of the hypernetwork) and by applying training-free post-hoc interventions (Selective Layer Boosting and Conflict-Aware Internalization) whose gains are measured on held-out accuracy. No equations reduce a derived quantity to a fitted input by construction, no load-bearing self-citations appear, and no ansatz or uniqueness claim is smuggled in; the derivation chain remains self-contained against the KID-Bench benchmark and RAG baselines.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The account rests on two domain assumptions about layer targeting and margin constancy that are stated but not derived; no new entities are introduced and no free parameters are explicitly fitted in the abstract description.

axioms (2)
  • domain assumption The hypernetwork already targets the right layers for the adaptation task.
    Invoked to argue the failure is magnitude rather than representational.
  • domain assumption Adapter margin remains approximately constant across different documents.
    Central to the claim that pretrained margin growth causes the conflict loss.

pith-pipeline@v0.9.0 · 5573 in / 1490 out tokens · 66779 ms · 2026-05-12T03:17:33.936567+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 14 internal anchors

  1. [1]

    Qwen Technical Report

    Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, et al. Qwen technical report. arXiv preprint arXiv:2309.16609, 2023

  2. [2]

    X-LoRA: Mixture of Low-Rank Adapter Experts

    Eric L. Buehler and Markus J. Buehler. X-LoRA: Mixture of low-rank adapter experts, a flexible framework for large language models with applications in protein mechanics and molecular design. APL Machine Learning, 2024. arXiv:2402.07148

  3. [3]

    Doc-to-LoRA: Learning to Instantly Internalize Contexts

    Rujikorn Charakorn, Edoardo Cetin, Shinnosuke Uesaka, and Robert Tjarko Lange. Doc-to-LoRA: Learning to instantly internalize contexts. arXiv preprint arXiv:2602.15902, 2026

  4. [4]

    Text-to-LoRA: Instant Transformer Adaption

    Rujikorn Charakorn et al. Text-to-LoRA: Instant transformer adaption. arXiv preprint arXiv:2506.06105, 2025

  5. [5]

    Training Verifiers to Solve Math Word Problems

    Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, and John Schulman. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021

  6. [6]

    Evaluating the Ripple Effects of Knowledge Editing in Language Models

    Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, and Mor Geva. Evaluating the ripple effects of knowledge editing in language models. Transactions of the Association for Computational Linguistics, 12:283–298, 2024. arXiv:2307.12976

  7. [7]

    Knowledge Neurons in Pretrained Transformers

    Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, and Furu Wei. Knowledge neurons in pretrained transformers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022. arXiv:2104.08696

  8. [8]

    A learned representation for artistic style

    Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. A learned representation for artistic style. International Conference on Learning Representations (ICLR), 2017. arXiv:1610.07629

  9. [9]

    Toy Models of Superposition

    Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, and Christopher Olah. Toy models of superposition. Transformer Circuits Thread, Anthropic, 2022. arXiv:2209.10652

  10. [10]

    Gemma: Open Models Based on Gemini Research and Technology

    Google DeepMind Gemma Team. Gemma: Open models based on Gemini research and technology. arXiv preprint arXiv:2403.08295, 2024

  11. [11]

    Dissecting Recall of Factual Associations in Auto-Regressive Language Models

    Mor Geva, Jasmijn Bastings, Katja Filippova, and Amir Globerson. Dissecting recall of factual associations in auto-regressive language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. arXiv:2304.14767

  12. [12]

    Transformer Feed-Forward Layers Are Key-Value Memories

    Mor Geva, Roei Schuster, Jonathan Berant, and Omer Levy. Transformer feed-forward layers are key-value memories. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. arXiv:2012.14913

  13. [13]

    HyperNetworks

    David Ha, Andrew Dai, and Quoc V. Le. HyperNetworks. In International Conference on Learning Representations (ICLR), 2017. arXiv:1609.09106

  14. [14]

    LoRA: Low-Rank Adaptation of Large Language Models

    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR), 2022. arXiv:2106.09685

  15. [15]

    LoRAHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

    Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, and Min Lin. LoRAHub: Efficient cross-task generalization via dynamic LoRA composition. In Conference on Language Modeling (COLM), 2024. arXiv:2307.13269

  16. [16]

    Transformer-Patcher: One mistake worth one neuron

    Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, and Zhang Xiong. Transformer-Patcher: One mistake worth one neuron. In International Conference on Learning Representations (ICLR), 2023. arXiv:2301.09785

  17. [17]

    Atlas: Few-Shot Learning with Retrieval Augmented Language Models

    Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, and Edouard Grave. Atlas: Few-shot learning with retrieval augmented language models. Journal of Machine Learning Research, 24(251):1–43, 2023. arXiv:2208.03299

  18. [18]

    Perceiver: General perception with iterative attention

    Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, and Joao Carreira. Perceiver: General perception with iterative attention. In International Conference on Machine Learning (ICML), 2021. arXiv:2103.03206

  19. [19]

    Mistral 7B

    Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, et al. Mistral 7B. arXiv preprint arXiv:2310.06825, 2023

  20. [20]

    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems (NeurIPS), 2020. arXiv:2005.11401

  21. [21]

    HyperLoRA: Parameter-efficient adaptive generation for portrait synthesis

    Mengtian Li, Jinshu Chen, Wanquan Feng, Bingchuan Li, Fei Dai, Songtao Zhao, and Qian He. HyperLoRA: Parameter-efficient adaptive generation for portrait synthesis. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

  22. [22]

    TruthfulQA: Measuring How Models Mimic Human Falsehoods

    Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022. arXiv:2109.07958

  23. [23]

    DoRA: Weight-Decomposed Low-Rank Adaptation

    Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, and Min-Hung Chen. DoRA: Weight-decomposed low-rank adaptation. In International Conference on Machine Learning (ICML), 2024. arXiv:2402.09353

  24. [24]

    SHINE: A Scalable In-Context Hypernetwork for Mapping Context to LoRA in a Single Pass

    Yewei Liu, Xiyuan Wang, Yansheng Mao, Yoav Gelbery, Haggai Maron, and Muhan Zhang. SHINE: A scalable in-context hypernetwork for mapping context to LoRA in a single pass. arXiv preprint arXiv:2602.06358, 2026

  25. [25]

    Locating and Editing Factual Associations in GPT

    Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in GPT. In Advances in Neural Information Processing Systems (NeurIPS), 2022. arXiv:2202.05262

  26. [26]

    Mass-Editing Memory in a Transformer

    Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, and David Bau. Mass-editing memory in a transformer. In International Conference on Learning Representations (ICLR), 2023. arXiv:2210.07229

  27. [27]

    Fast Model Editing at Scale

    Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, and Christopher D. Manning. Fast model editing at scale. In International Conference on Learning Representations (ICLR), 2022. arXiv:2110.11309

  28. [28]

    Memory-Based Model Editing at Scale

    Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, and Chelsea Finn. Memory-based model editing at scale. In International Conference on Machine Learning (ICML), 2022. arXiv:2206.06520

  29. [29]

    FiLM: Visual Reasoning with a General Conditioning Layer

    Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron Courville. FiLM: Visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018. arXiv:1709.07871

  30. [30]

    SQuAD: 100,000+ Questions for Machine Comprehension of Text

    Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016. arXiv:1606.05250

  31. [31]

    LoRA.rar: Learning to merge LoRAs via hypernetworks for subject-style conditioned image generation

    Donald Shenaj, Ondrej Bohdal, Mete Ozay, Pietro Zanuttigh, and Umberto Michieli. LoRA.rar: Learning to merge LoRAs via hypernetworks for subject-style conditioned image generation. In Proceedings of the International Conference on Computer Vision (ICCV), 2025. arXiv:2412.05148

  32. [32]

    ConflictBank: A benchmark for evaluating the influence of knowledge conflicts in large language models

    Zhaochen Su, Jun Zhang, et al. ConflictBank: A benchmark for evaluating the influence of knowledge conflicts in large language models. In Advances in Neural Information Processing Systems, Datasets and Benchmarks Track, 2024. arXiv:2408.12076

  33. [33]

    Attention Is All You Need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), 2017. arXiv:1706.03762

  34. [34]

    Vision as LoRA

    Han Wang et al. Vision as LoRA. arXiv preprint arXiv:2503.20680, 2025

  35. [35]

    Mixture of LoRA Experts

    Xun Wu, Shaohan Huang, and Furu Wei. Mixture of LoRA experts. In International Conference on Learning Representations (ICLR), 2024. arXiv:2404.13628

  36. [36]

    Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

    Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, and Yu Su. Adaptive chameleon or stubborn sloth: Revealing the behavior of large language models in knowledge conflicts. In International Conference on Learning Representations (ICLR), 2024. arXiv:2305.13300

  37. [37]

    Knowledge Conflicts for LLMs: A Survey

    Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, and Wei Xu. Knowledge conflicts for LLMs: A survey. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. arXiv:2403.08319

  38. [38]

    AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

    Qingru Zhang, Minshuo Chen, Alexander Bukharin, Nikos Karampatziakis, Pengcheng He, Yu Cheng, Weizhu Chen, and Tuo Zhao. AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning. In International Conference on Learning Representations (ICLR), 2023. arXiv:2303.10512