Navigating the landscape of multimodal AI in medicine: a scoping review on technical challenges and clinical applications
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Resilient Vision-Tabular Multimodal Learning under Modality Missingness
A vision-tabular multimodal transformer uses modality tokens, masked self-attention, and stochastic modality dropout to maintain performance under pervasive missing data on MIMIC-CXR and MIMIC-IV for 14-label diagnostic classification.