2106.09681
6 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (6)
verdicts: UNVERDICTED (6)
roles: method (1)
polarities: use method (1)
citing papers explorer
- Representing 3D Faces with Learnable B-Spline Volumes
  CUBE encodes 3D faces via a grid of learned high-dimensional B-spline features that map parametrically to a base shape plus MLP-refined displacements, enabling dense correspondence and state-of-the-art registration from point clouds or images.
- Ultra-low-light computer vision using trained photon correlations
  Trained correlated-photon illumination paired with a Transformer backend improves object classification accuracy by up to 15 percentage points in photon-starved noisy imaging.
- TextTeacher: What Can Language Teach About Images?
  TextTeacher uses frozen text embeddings from captions as semantic anchors to guide vision model training, improving ImageNet accuracy by up to 2.7 p.p. and transfer performance by 1.0 p.p. on average.
- RT-Transformer: The Transformer Block as a Spherical State Estimator
  Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.
- ShapeY: A Principled Framework for Measuring Shape Recognition Capacity via Nearest-Neighbor Matching
  ShapeY is a benchmark dataset and nearest-neighbor protocol that measures shape-based recognition in vision models, revealing that even state-of-the-art networks fail to generalize consistently across 3D viewpoints and non-shape appearance changes.
- One for All: A Non-Linear Transformer can Enable Cross-Domain Generalization for In-Context Reinforcement Learning
  Non-linear transformers enable cross-domain generalization in in-context RL by representing value functions from different domains with shared weights inside a shared RKHS.