The hidden life of tokens: Reducing hallucination of large vision-language models via visual information steering.arXiv preprint arXiv:2502.03628

Zhuowei Li, Haizhou Shi, Yunhe Gao, Di Liu, Zhenting Wang, Yuxiao Chen, Ting Liu, Long Zhao, Hao Wang, Dimitris N · 2025 · arXiv 2502.03628

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models

cs.CV · 2026-04-28 · conditional · novelty 7.0

Prefill-Time Intervention (PTI) reduces hallucinations in large vision-language models by applying a one-time modality-aware steering correction to the initial KV cache at the prefill stage rather than during autoregressive decoding.

Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models

cs.CV · 2025-11-13 · unverdicted · novelty 6.0

RUDDER creates a persistent visual anchor by extracting CARD from prefill residuals and modulating its injection via an adaptive Beta Gate, cutting CHAIR_S by 24.4% and CHAIR_i by 23.6% on average across LLaVA, Idefics2, InstructBLIP and Qwen2.5-VL with >96% throughput.

Not Blind but Silenced: Rebalancing Vision and Language via Adversarial Counter-Commonsense Equilibrium

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

ACE uses adversarial counter-commonsense perturbations on image tokens during decoding to suppress hallucinated linguistic priors while preserving stable visual signals in MLLMs.

Hallucination of Multimodal Large Language Models: A Survey

cs.CV · 2024-04-29 · accept · novelty 5.0

The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.

From Weights to Activations: Is Steering the Next Frontier of Adaptation?

cs.CL · 2026-04-15 · unverdicted · novelty 4.0

Steering is positioned as a distinct adaptation paradigm that uses targeted activation interventions for local, reversible behavioral changes without parameter updates.

When Language Overwrites Vision: Over-Alignment and Geometric Debiasing in Vision-Language Models

cs.CV · 2026-05-07 · 3 refs

GEASS: Gated Evidence-Adaptive Selective Caption Trust for Vision-Language Models

cs.CV · 2026-05-03 · 2 refs

citing papers explorer

Showing 7 of 7 citing papers.

Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models cs.CV · 2026-04-28 · conditional · none · ref 23
Prefill-Time Intervention (PTI) reduces hallucinations in large vision-language models by applying a one-time modality-aware steering correction to the initial KV cache at the prefill stage rather than during autoregressive decoding.
Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models cs.CV · 2025-11-13 · unverdicted · none · ref 11
RUDDER creates a persistent visual anchor by extracting CARD from prefill residuals and modulating its injection via an adaptive Beta Gate, cutting CHAIR_S by 24.4% and CHAIR_i by 23.6% on average across LLaVA, Idefics2, InstructBLIP and Qwen2.5-VL with >96% throughput.
Not Blind but Silenced: Rebalancing Vision and Language via Adversarial Counter-Commonsense Equilibrium cs.CV · 2026-05-11 · unverdicted · none · ref 19
ACE uses adversarial counter-commonsense perturbations on image tokens during decoding to suppress hallucinated linguistic priors while preserving stable visual signals in MLLMs.
Hallucination of Multimodal Large Language Models: A Survey cs.CV · 2024-04-29 · accept · none · ref 108
The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.
From Weights to Activations: Is Steering the Next Frontier of Adaptation? cs.CL · 2026-04-15 · unverdicted · none · ref 11
Steering is positioned as a distinct adaptation paradigm that uses targeted activation interventions for local, reversible behavioral changes without parameter updates.
When Language Overwrites Vision: Over-Alignment and Geometric Debiasing in Vision-Language Models cs.CV · 2026-05-07 · unreviewed · ref 10 · 3 links
GEASS: Gated Evidence-Adaptive Selective Caption Trust for Vision-Language Models cs.CV · 2026-05-03 · unreviewed · ref 5 · 2 links

The hidden life of tokens: Reducing hallucination of large vision-language models via visual information steering.arXiv preprint arXiv:2502.03628

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer