Recognition: 2 theorem links · Lean Theorem
Instruction Lens Score: Your Instruction Contributes a Powerful Object Hallucination Detector for Multimodal Large Language Models
Pith reviewed 2026-05-13 05:47 UTC · model grok-4.3
The pith
Instruction token embeddings can detect object hallucinations in multimodal LLMs without extra models or training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Instruction token embeddings implicitly encode visual information while filtering erroneous signals from misleading visual embeddings. The Instruction Lens Score, formed from a Calibrated Local Score and a Context Consistency Score applied to object tokens, therefore serves as an effective plug-and-play detector of object hallucinations.
What carries the argument
The Instruction Lens Score (InsLen), which measures hallucination risk directly from instruction token embeddings using calibrated local similarity and context-consistency checks on generated object tokens.
If this is right
- The detector applies to many different MLLM architectures with no architecture-specific changes.
- No auxiliary models or retraining are needed at deployment.
- Both local embedding calibration and global context consistency contribute to the final score.
- The approach improves detection accuracy over prior methods on standard hallucination benchmarks.
Where Pith is reading between the lines
- Prompt design that strengthens instruction tokens could further reduce hallucinations at the source.
- The same embedding lens might be tested on attribute or relation hallucinations beyond objects.
- Real-time application of InsLen could enable on-the-fly output correction during generation.
Load-bearing premise
Instruction token embeddings encode enough visual information to filter errors introduced by the visual stream, and the two proposed scores reliably indicate whether an object name is hallucinated.
What would settle it
If InsLen scores fail to correlate with human-verified object presence on a new image-prompt dataset where ground-truth objects are exhaustively labeled, the detection method does not work.
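A minimal sketch of that settling test, assuming per-mention InsLen-style scores and exhaustive ground-truth object labels are already in hand; the function and variable names below are illustrative, not the paper's released code.

```python
# Sketch of the settling test: check whether per-mention detector scores
# separate hallucinated from grounded object mentions on an exhaustively
# labeled image-prompt set. `scores` and `is_grounded` are assumed inputs.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_detector(scores: np.ndarray, is_grounded: np.ndarray) -> dict:
    """AUROC and correlation of detector scores against ground-truth presence.

    scores      -- one InsLen-style score per generated object mention
    is_grounded -- 1 if the mentioned object is actually in the image, else 0
    """
    auroc = roc_auc_score(is_grounded, scores)             # 0.5 = chance level
    corr = float(np.corrcoef(scores, is_grounded)[0, 1])   # point-biserial r
    return {"auroc": auroc, "correlation": corr}

# An AUROC near 0.5 and a negligible correlation on such a dataset would
# falsify the detection claim.
```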
Original abstract
Multimodal large language models (MLLMs) have achieved remarkable progress, yet the object hallucination remains a critical challenge for reliable deployment. In this paper, we present an in-depth analysis of instruction token embeddings and reveal that they implicitly encode visual information while effectively filtering erroneous information introduced by misleading visual embeddings. Building on this insight, we propose the Instruction Lens Score (InsLen), which combines a Calibrated Local Score with a Context Consistency Score that measures context consistency of the object tokens. The proposed approach serves as a plug-and-play object hallucination detector without relying on auxiliary models or additional training. Extensive experiments across multiple benchmarks and diverse MLLM architectures demonstrate that InsLen consistently outperforms existing hallucination detection methods, highlighting its effectiveness and robustness. The code is available at https://github.com/Fraserlairh/Instruction-Lens-Score.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes instruction token embeddings in multimodal large language models (MLLMs), claiming they implicitly encode visual information while filtering erroneous details from misleading visual embeddings. It proposes the Instruction Lens Score (InsLen) as the sum of a Calibrated Local Score and a Context Consistency Score for object hallucination detection. The approach is presented as plug-and-play, requiring no auxiliary models or training, and is reported to outperform prior methods across multiple benchmarks and diverse MLLM architectures, with code released publicly.
Significance. If the empirical claims hold after addressing the definitional and validation gaps, this would represent a useful contribution by providing a lightweight, training-free detector that leverages existing model internals to improve MLLM reliability. The public code release is a clear strength supporting reproducibility.
major comments (3)
- [§3] §3 (Method): The central claim that instruction token embeddings 'implicitly encode visual information while effectively filtering erroneous information' is load-bearing for the entire InsLen construction, yet no derivation, visualization, or ablation is provided to show that the Calibrated Local Score plus Context Consistency Score isolates object hallucination rather than other inconsistencies (e.g., syntactic or factual). Without this, the plug-and-play claim across architectures cannot be evaluated.
- [§3.2] §3.2 (Calibrated Local Score definition): The term 'Calibrated' suggests a data-dependent step; explicit equations are needed to confirm whether any parameters or thresholds are fitted to the evaluation benchmarks or remain fixed and architecture-independent. If the former, it contradicts the 'no additional training' and 'plug-and-play' assertions.
- [§5] §5 (Experiments): No ablation or sensitivity analysis is reported on the Context Consistency Score's dependence on context window size, tokenizer choice, or prompt length. Such tests are required to substantiate robustness across the claimed 'diverse MLLM architectures,' as tokenizer differences could introduce confounds that undermine cross-model outperformance.
minor comments (2)
- [Abstract] The abstract would be clearer if it named the specific benchmarks and MLLM families used in the 'extensive experiments.'
- [Abstract] Ensure the GitHub link in the abstract points to a repository containing the exact code and hyperparameters used for the reported results.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below with clarifications and commit to revisions that strengthen the presentation without altering the core claims.
Point-by-point responses
- Referee: [§3] §3 (Method): The central claim that instruction token embeddings 'implicitly encode visual information while effectively filtering erroneous information' is load-bearing for the entire InsLen construction, yet no derivation, visualization, or ablation is provided to show that the Calibrated Local Score plus Context Consistency Score isolates object hallucination rather than other inconsistencies (e.g., syntactic or factual). Without this, the plug-and-play claim across architectures cannot be evaluated.
  Authors: We acknowledge that the supporting evidence for the central claim in §3 can be strengthened. The manuscript includes an analysis of instruction token embeddings, but we agree that explicit visualizations and targeted ablations are needed to demonstrate isolation of object hallucinations. In the revised manuscript we will add t-SNE projections of instruction embeddings for hallucinated versus non-hallucinated cases and an ablation that removes the filtering component of the Calibrated Local Score, showing that performance drops specifically on object hallucination benchmarks while remaining stable on syntactic or factual inconsistency tasks. revision: yes
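A minimal sketch of what the promised visualization could look like, assuming the instruction-token embeddings and hallucination labels have already been extracted per object mention; the names are illustrative, not the authors' code.

```python
# Minimal sketch of the promised t-SNE visualization: project instruction-token
# embeddings for hallucinated vs. grounded object mentions into 2-D.
# `embeddings` (N x d array) and `hallucinated` (N boolean array) are assumed
# to be extracted beforehand.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_instruction_embeddings(embeddings: np.ndarray,
                                hallucinated: np.ndarray) -> None:
    coords = TSNE(n_components=2, perplexity=30, init="pca",
                  random_state=0).fit_transform(embeddings)
    for value, name in [(False, "grounded"), (True, "hallucinated")]:
        mask = hallucinated == value
        plt.scatter(coords[mask, 0], coords[mask, 1], s=8, label=name)
    plt.legend()
    plt.title("Instruction-token embeddings (t-SNE)")
    plt.show()
```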
- Referee: [§3.2] §3.2 (Calibrated Local Score definition): The term 'Calibrated' suggests a data-dependent step; explicit equations are needed to confirm whether any parameters or thresholds are fitted to the evaluation benchmarks or remain fixed and architecture-independent. If the former, it contradicts the 'no additional training' and 'plug-and-play' assertions.
  Authors: The calibration step normalizes using fixed, pre-computed statistics (mean and standard deviation) of instruction token embeddings drawn from the model's original training distribution; no parameters are fitted to any evaluation benchmark. We will insert the full mathematical definition of the Calibrated Local Score in the revised §3.2, explicitly stating that the normalization constants are architecture-specific but benchmark-independent and require no training or tuning on test data. revision: yes
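Read literally, this describes a z-score normalization against frozen, architecture-specific statistics. A minimal sketch under that reading; the reference constants are hypothetical placeholders, not values taken from the paper.

```python
# Sketch of the calibration as described in the rebuttal: z-score normalization
# with frozen, architecture-specific reference statistics, nothing fitted on
# test data. REFERENCE_MEAN and REFERENCE_STD are hypothetical pre-computed
# constants used only for illustration.
REFERENCE_MEAN = 0.0   # computed once per architecture, then frozen
REFERENCE_STD = 1.0

def calibrated_local_score(raw_similarity: float) -> float:
    """Normalize a raw local similarity against the fixed reference statistics."""
    return (raw_similarity - REFERENCE_MEAN) / (REFERENCE_STD + 1e-8)
```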
- Referee: [§5] §5 (Experiments): No ablation or sensitivity analysis is reported on the Context Consistency Score's dependence on context window size, tokenizer choice, or prompt length. Such tests are required to substantiate robustness across the claimed 'diverse MLLM architectures,' as tokenizer differences could introduce confounds that undermine cross-model outperformance.
  Authors: We agree that additional sensitivity analyses would better substantiate the robustness claims. In the revised experiments section we will report results for the Context Consistency Score under varied context window sizes, across the tokenizers native to each evaluated MLLM, and with prompt lengths ranging from short to extended. These ablations will confirm that the reported outperformance remains consistent and is not driven by tokenizer-specific artifacts. revision: yes
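A minimal sketch of one such sweep over context window sizes; `score_dataset` is a hypothetical stand-in for the real scoring pipeline, not part of the released code.

```python
# Sketch of the promised sensitivity sweep: vary the context window used by the
# Context Consistency Score and record detection AUROC per setting.
# `score_dataset(window)` is a hypothetical helper returning per-mention scores
# and ground-truth labels for one window size.
from sklearn.metrics import roc_auc_score

def window_sensitivity(window_sizes, score_dataset):
    results = {}
    for window in window_sizes:
        scores, labels = score_dataset(window)
        results[window] = roc_auc_score(labels, scores)
    return results

# Example: window_sensitivity([4, 8, 16, 32], score_dataset)
# A flat AUROC curve across window sizes would support the robustness claim.
```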
Circularity Check
No circularity: derivation is self-contained from embedding analysis to defined scores
Full rationale
The paper begins with an empirical observation on instruction token embeddings (that they encode visual information and filter errors), then directly defines InsLen as the sum of a Calibrated Local Score and Context Consistency Score. No equations or definitions reduce the final detector to a fitted parameter on the evaluation benchmarks, a self-citation chain, or a tautological renaming. The plug-and-play claim rests on the stated construction rather than on any input that is itself derived from the output metric. Experiments across architectures serve as external validation rather than circular confirmation.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · link: unclear
  InsLen score S_Ins(o) = ω·S_cls(o) + (1 − ω)·S_ccs(o), where S_cls fuses the max instruction-embedding confidence with the cosine similarity of object and image embeddings, and S_ccs uses the ℓ2 distance to averaged high-confidence instruction embeddings (see the sketch after this list).
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · link: unclear
  Logit Lens projection of instruction embeddings yields higher AUROC separation of hallucinated vs. real objects than image embeddings; no derivation from a single distinction or reciprocal cost.
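A minimal sketch of the weighted combination stated in the first entry above. Only the sum S_Ins = ω·S_cls + (1 − ω)·S_ccs and the rough component descriptions come from that note; the product fusion of confidence and cosine, the negated ℓ2 term, and the default weight are assumptions made for illustration.

```python
# Minimal sketch of the combination in the first entry above:
#   S_Ins(o) = omega * S_cls(o) + (1 - omega) * S_ccs(o)
# Fusing confidence and cosine by a product, negating the l2 distance, and
# omega = 0.5 are illustrative assumptions, not the paper's definitions.
import numpy as np

def _cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def inslen_score(obj_emb: np.ndarray,
                 img_emb: np.ndarray,
                 max_instr_confidence: float,
                 obj_instr_emb: np.ndarray,
                 instr_reference: np.ndarray,
                 omega: float = 0.5) -> float:
    # Calibrated Local Score: fuse the max instruction-embedding confidence
    # with the object/image embedding similarity (product chosen for the sketch).
    s_cls = max_instr_confidence * _cosine(obj_emb, img_emb)
    # Context Consistency Score: closeness of the object's instruction embedding
    # to the average of high-confidence instruction embeddings (negated l2
    # distance, so higher means more consistent).
    s_ccs = -float(np.linalg.norm(obj_instr_emb - instr_reference))
    return omega * s_cls + (1.0 - omega) * s_ccs
```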