Multimodal unified attention networks for vision-and-language interactions
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Alzheimer's Disease Diagnosis using a Multimodal Approach with 3D MRI and PET
Multimodal 3D CNN model with GMU, gated self-attention, and sparsely gated MoE achieves up to 95.47% accuracy on NC vs AD using MRI and PET, with ablations showing MoE benefit.
- ViASNet: A Video Ad Saliency Network for Predicting Dynamic Saliency and Viewer Engagement
ViASNet applies a 3D U-Net architecture augmented with audio and semantic inputs to predict dynamic saliency in video ads and uses frame-wise entropy to diagnose low-engagement scenes on eye-tracked data from 151 ads.