SDCD: Structure-Disrupted Contrastive Decoding for Mitigating Hallucinations in Large Vision-Language Models. arXiv preprint arXiv:2601.03500.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2). Verdicts: UNVERDICTED (2).
citing papers explorer
- The Cost of Language: Centroid Erasure Exposes and Exploits Modal Competition in Multimodal Language Models
  Centroid erasure shows language representations overshadow vision in multimodal models, and text-centroid contrastive decoding recovers substantial accuracy on visual reasoning tasks.
- CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
  CHASD is an inference-time framework that gates contrastive decoding via an uncertainty threshold and constructs negative branches through attention-guided perturbations of salient visual tokens to mitigate hallucinations in LVLMs.