M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models
1 Pith paper cites this work. Polarity classification is still indexing.
citation-role summary
background: 1
citation-polarity summary
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
roles: background (1)
polarities: background (1)
representative citing papers
Med-StepBench: A Hierarchical Reasoning Framework for Evaluating Hallucinations in Medical Vision-Language Models
Med-StepBench is the first large-scale step-wise hallucination benchmark for 3D oncological PET/CT that decomposes clinical reasoning into four stages and reveals systematic VLM failures hidden by aggregate metrics.