YARD is a training-free method that uses a Y-shaped decoder architecture and register tokens to improve contrastive decoding for hallucination reduction in LVLMs with lower latency.
arXiv preprint arXiv:2509.25177
4 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 4
years
2026 4
verdicts
UNVERDICTED 4
representative citing papers
ADAPT reduces MLLM hallucinations 40-60% by aligning cross-attention dynamics via visual anchors, supervised inference, and preference tuning while preserving general capabilities.
Presents GranFact benchmark with expert annotations and a reliability-prioritized DPO method to improve fine-grained yet reliable generation in MLLMs.
Fox detects risky attention heads in LVLMs using visual attention entropy and severs hallucination shortcuts via numerical logit saturation and conflict-gated decoding, outperforming prior methods by 29.1%.
citing papers explorer
- YARD: Y-Architecture Register Decoding for Efficient Hallucination Mitigation in Large Vision-Language Models
  YARD is a training-free method that uses a Y-shaped decoder architecture and register tokens to improve contrastive decoding for hallucination reduction in LVLMs with lower latency.
- ADAPT: Attention Dynamics Alignment with Preference Tuning for Faithful MLLMs
  ADAPT reduces MLLM hallucinations 40-60% by aligning cross-attention dynamics via visual anchors, supervised inference, and preference tuning while preserving general capabilities.
- Reliability-Prioritized Fine-Grained Generation in Multimodal Large
  Presents GranFact benchmark with expert annotations and a reliability-prioritized DPO method to improve fine-grained yet reliable generation in MLLMs.
- Dismantling Pathological Shortcuts: A Causal Framework for Faithful LVLM Decoding
  Fox detects risky attention heads in LVLMs using visual attention entropy and severs hallucination shortcuts via numerical logit saturation and conflict-gated decoding, outperforming prior methods by 29.1%.