RoFormer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED: 2
representative citing papers
HiF-VLA improves long-horizon robotic manipulation by encoding past motion as hindsight priors and anticipating future motion through foresight reasoning inside a VLA framework.
citing papers explorer
- FishRoPE: Projective Rotary Position Embeddings for Omnidirectional Visual Perception
  FishRoPE reparameterizes attention mechanisms in fisheye images to use angular separation in spherical coordinates, enabling frozen vision foundation models to achieve state-of-the-art results on 2D detection and BEV segmentation benchmarks.
- HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models
  HiF-VLA improves long-horizon robotic manipulation by encoding past motion as hindsight priors and anticipating future motion through foresight reasoning inside a VLA framework.