A survey of multimodal hallucination evaluation and detection
2 Pith papers cite this work, both from 2026.
Representative citing papers
- Once Correct, Still Wrong: Counterfactual Hallucination in Multilingual Vision-Language Models
VLMs exhibit sharply higher counterfactual hallucination rates in Arabic and dialects despite high true-statement accuracy, revealed by the new M²CQA benchmark and CFHR metric.
- TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design