2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Citing papers
- VerteNet -- A Multi-Context Hybrid CNN Transformer for Accurate Vertebral Landmark Localization in Lateral Spine DXA Images
  A dual-resolution self- and cross-attention hybrid model localizes T12-L5 vertebral landmarks in multi-scanner DXA images with normalized mean error 4.92 pixels and median 2.35 pixels, outperforming baselines.
- Insights from Visual Cognition: Understanding Human Action Dynamics with Overall Glance and Refined Gaze Transformer
  The OG-ReG Transformer achieves state-of-the-art results on Kinetics-400, Something-Something v2, and Diving-48 by combining global glance and local gaze processing paths.