Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation · arXiv preprint arXiv:2512.24271

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

MultiToP: Learning to Patch Visual Tokens to Mitigate Hallucinations in Video Large Multimodal Models

cs.CV · 2026-06-10 · unverdicted · novelty 5.0

MultiToP mitigates hallucinations in video multimodal models by training a Visual Token Patcher with information-guided rank calibration to selectively replace unreliable tokens, yielding a 50.60% F1 gain on Vript-HAL and an 18.58% accuracy gain on ActivityNet-QA.

citing papers explorer

Showing 1 of 1 citing paper.

MultiToP: Learning to Patch Visual Tokens to Mitigate Hallucinations in Video Large Multimodal Models

cs.CV · 2026-06-10 · unverdicted · none · ref 23
