Chain of Evidence introduces a retriever-agnostic visual attribution method for iRAG that reasons over document screenshots with VLMs to output precise bounding boxes, outperforming text baselines on Wiki-CoE and SlideVQA.
Title resolution pending
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 4
verdicts
UNVERDICTED: 4
representative citing papers
citing papers explorer
- Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation
  Chain of Evidence introduces a retriever-agnostic visual attribution method for iRAG that reasons over document screenshots with VLMs to output precise bounding boxes, outperforming text baselines on Wiki-CoE and SlideVQA.
- Mitigating Entangled Steering in Large Vision-Language Models for Hallucination Reduction
  MESA reduces hallucinations in LVLMs via controlled selective latent intervention that preserves the original token distribution.
- FAIR_XAI: Improving Multimodal Foundation Model Fairness via Explainability for Wellbeing Assessment
  Vision-language models for wellbeing assessment exhibit dataset-dependent performance and demographic biases, with explainability interventions providing inconsistent fairness gains at potential accuracy costs.
- Synchronized Realities: Towards Magic Mobile Experiences through Aligned AR
  This reflection paper argues that synchronizing AR experiences with the physical environment through generative AI creates more natural and reactive mobile AR applications.