Introduces the UCSF-PDGM-VQA dataset of 2387 QA pairs from 473 glioma MRI studies and demonstrates that state-of-the-art VLMs exhibit modality collapse on multi-sequence 3D medical images.
arXiv preprint arXiv:2511.17803
5 Pith papers cite this work.
Fields: cs.CV (5)
Years: 2026 (5)
Representative citing papers
citing papers explorer
- UCSF-PDGM-VQA: Visual Question Answering dataset for brain tumor MRI interpretation
  Introduces the UCSF-PDGM-VQA dataset of 2387 QA pairs from 473 glioma MRI studies and demonstrates that state-of-the-art VLMs exhibit modality collapse on multi-sequence 3D medical images.
- JANUS: Anatomy-Conditioned Gating for Robust CT Triage Under Distribution Shift
  JANUS conditions Vision Transformer embeddings on macro-radiomic priors via anatomically guided gating, reaching macro-AUROC 0.88 on an internal test set of 5082 cases and 0.87 on an external set of 2000 cases while improving calibration and reducing high-confidence false positives under domain shift.
- Enhancing Fine-Grained Spatial Grounding in 3D CT Report Generation via Discriminative Guidance
  DCP-PD improves macro F1 scores on CT report generation benchmarks and introduces a hierarchical location-aware evaluation protocol that reveals ongoing challenges in pathology spatial grounding.
- CT-IDP: Segmentation-Derived Quantitative Phenotypes for Interpretable Abdominal CT Disease Classification
  CT-IDP derives over 900 quantitative phenotypes from multi-organ CT segmentations and uses sparse logistic regression to classify diseases, achieving macro-AUCs of 0.897/0.877/0.780 on MERLIN/Duke-Abdomen/AMOS datasets and outperforming a DINOv3 vision transformer baseline.
- Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness
  Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.