Convergent Stochastic Training of Attention and Understanding LoRA

Attention and LoRA regression losses induce Poincaré inequalities under mild regularization, so SGD-mimicking SDEs converge to minimizers with no assumptions on data or model size.
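This TL;DR compresses a standard chain of reasoning. A minimal sketch of that chain, assuming the usual continuous-time Langevin idealization of SGD (the exact SDE, temperature, and constants below are illustrative assumptions, not the paper's statements):

```latex
% Regularized regression loss and its Gibbs measure
L_\lambda(\theta) = L(\theta) + \lambda\,\lVert\theta\rVert^2,
\qquad
\pi(\mathrm{d}\theta) \;\propto\; e^{-L_\lambda(\theta)}\,\mathrm{d}\theta .

% Poincare inequality for \pi with constant C_P:
\operatorname{Var}_\pi(f) \;\le\; C_P\,\mathbb{E}_\pi\!\bigl[\lVert\nabla f\rVert^2\bigr]
\quad \text{for all smooth } f .

% Langevin SDE mimicking SGD:
\mathrm{d}\theta_t = -\nabla L_\lambda(\theta_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t ,

% and the Poincare inequality gives exponential convergence of the law \rho_t:
\chi^2(\rho_t \,\Vert\, \pi) \;\le\; e^{-2t/C_P}\,\chi^2(\rho_0 \,\Vert\, \pi) .
```

At low temperature the Gibbs measure concentrates on near-minimizers, which is the sense in which the SDE "converges to minimizers."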
International Conference on Learning Representations

10 papers cite this work.

Representative citing papers:
- LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection
LOFT unifies orthogonal PEFT by treating adaptation as low-rank subspace rotation and adds task-aware support selection that improves efficiency under fixed budgets.
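A minimal sketch of what "adaptation as low-rank subspace rotation" can mean, using a Cayley-parameterized orthogonal matrix generated by a low-rank skew-symmetric factor. The parameterization and names are illustrative assumptions, not LOFT's actual construction:

```python
import torch

def lowrank_rotation(B, C):
    # A = B C^T - C B^T is skew-symmetric (A^T = -A) with rank <= 2r
    A = B @ C.T - C @ B.T
    I = torch.eye(A.shape[0], dtype=A.dtype)
    # Cayley transform: maps any skew-symmetric A to an orthogonal R
    return (I - A) @ torch.linalg.inv(I + A)

d, r = 64, 4
W = torch.randn(d, d)            # frozen pretrained weight
B = torch.randn(d, r) * 0.01     # trainable low-rank factors
C = torch.randn(d, r) * 0.01
R = lowrank_rotation(B, C)
W_adapted = R @ W                # adaptation rotates W instead of adding to it
assert torch.allclose(R @ R.T, torch.eye(d), atol=1e-4)
```

Rotating rather than adding preserves the pretrained weight's singular values, which is the usual motivation for orthogonal PEFT.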
- Unlocking Compositional Generalization in Continual Few-Shot Learning
A dual-phase framework using self-supervised ViT slots optimizes representations for class identity during training and composes them dynamically at inference to achieve state-of-the-art generalization to unseen concepts with minimal forgetting in continual few-shot learning.
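A minimal sketch of the inference-time composition idea, assuming slot features from a frozen self-supervised ViT and per-concept prototypes learned during training. The nearest-prototype matching rule and all names are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def compose_concepts(slot_feats, protos, proto_labels):
    """slot_feats: [S, d] object-centric slots for one query image.
    protos: [P, d] per-concept slot prototypes; proto_labels: [P] ids.
    Match each slot to its nearest prototype and describe the image by
    the *set* of matched concepts, so unseen combinations of known
    concepts need no retraining."""
    sims = F.normalize(slot_feats, dim=1) @ F.normalize(protos, dim=1).T
    return proto_labels[sims.argmax(dim=1)].unique()
```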
- Fix the Loss, Not the Radius: Rethinking the Adversarial Perturbation of Sharpness-Aware Minimization
LE-SAM inverts SAM by fixing the loss budget instead of the parameter-space radius, yielding better generalization across benchmarks.
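A minimal sketch of the "fix the loss budget" inversion, via the first-order view: classic SAM perturbs by eps = rho * g / ||g|| (fixed radius rho), while a loss-budget variant scales eps so the linearized loss increase <eps, g> equals a fixed delta. This is an illustrative reading, not necessarily LE-SAM's exact update:

```python
import torch

def loss_budget_sam_step(model, loss_fn, opt, x, y, delta=0.05):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.clone() for p in params]
    gnorm_sq = sum(g.pow(2).sum() for g in grads)
    # eps = delta * g / ||g||^2, so <eps, g> = delta: the perturbation
    # spends a fixed *loss* budget rather than a fixed parameter radius.
    # (A practical version would cap the step near stationary points.)
    scale = delta / (gnorm_sq + 1e-12)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(scale * g)
    opt.zero_grad()
    loss_fn(model(x), y).backward()   # gradient at the perturbed point
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(scale * g)         # undo the perturbation
    opt.step()
```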
- SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data
SeBA is a joint-embedding framework that separates tabular data into two complementary views and aligns one view's representations to the nearest-neighbor structure of the other, improving feature-label relationships and achieving SOTA results on most benchmarks without relying on augmentations.
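A minimal sketch of the cross-view alignment, assuming the two "separated-at-birth" views are column splits of the same table and alignment means pulling a sample's view-A embedding toward the view-A embedding of its nearest neighbor under view B. Function and variable names are hypothetical:

```python
import torch
import torch.nn.functional as F

def cross_view_nn_alignment(za, zb):
    """za, zb: [N, d] batch embeddings of the two complementary views."""
    za = F.normalize(za, dim=1)
    zb = F.normalize(zb, dim=1)
    sim_b = zb @ zb.T
    sim_b.fill_diagonal_(float("-inf"))      # exclude trivial self-matches
    nn_idx = sim_b.argmax(dim=1)             # nearest neighbor under view B
    # Pull each za[i] toward za[nn_idx[i]]: view A inherits view B's
    # neighborhood structure without any augmentations.
    return (1.0 - (za * za[nn_idx]).sum(dim=1)).mean()
```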
- PHALAR: Phasors for Learned Musical Audio Representations
PHALAR achieves up to 70% relative accuracy gain in stem retrieval with under half the parameters and 7x faster training by using phasor-based equivariant representations, setting new SOTA on multiple datasets.
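The equivariance that phasor representations buy is the classical Fourier shift property: translating a signal rotates every DFT phasor by a deterministic phase while leaving magnitudes untouched. A minimal numerical check of that property (illustrative only; PHALAR's actual architecture is not shown here):

```python
import numpy as np

N, shift = 256, 17
x = np.random.randn(N)
X = np.fft.fft(x)
X_shifted = np.fft.fft(np.roll(x, shift))
k = np.arange(N)
# Each phasor rotates by exp(-2*pi*i*k*shift/N); magnitudes are unchanged.
assert np.allclose(X_shifted, X * np.exp(-2j * np.pi * k * shift / N))
assert np.allclose(np.abs(X_shifted), np.abs(X))
```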
- See What Matters: Differentiable Grid Sample Pruning for Generalizable Vision-Language-Action Model
GridS reduces visual tokens in VLA models to under 10% of the original count via task-aware differentiable resampling, delivering 76% lower FLOPs with no drop in task success rate on benchmarks and real robots.
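A minimal sketch of differentiable grid-sample resampling of a visual token grid, assuming learned sampling coordinates in [-1, 1]. The coordinate predictor and all names are assumptions; only the resampling mechanism is shown:

```python
import torch
import torch.nn.functional as F

def resample_tokens(tokens, coords):
    """tokens: [B, H, W, C] visual token grid from the vision encoder.
    coords: [B, K, 2] learned (x, y) sampling locations in [-1, 1],
    with K << H * W (e.g., under 10% of the original token count).
    Bilinear grid sampling keeps token selection differentiable,
    unlike hard top-k pruning."""
    feat = tokens.permute(0, 3, 1, 2)                    # [B, C, H, W]
    grid = coords.unsqueeze(1)                           # [B, 1, K, 2]
    out = F.grid_sample(feat, grid, align_corners=True)  # [B, C, 1, K]
    return out.squeeze(2).permute(0, 2, 1)               # [B, K, C]
```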
- FEFormer: Frequency-enhanced Vision Transformer for Generic Knowledge Extraction and Adaptive Feature Fusion in Volumetric Medical Image Segmentation
A frequency-enhanced Vision Transformer with FDSA, FGMLP, WAFF, and FCSB modules outperforms prior state-of-the-art methods in both accuracy and efficiency on volumetric medical image segmentation.
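The module acronyms are the paper's. As a generic illustration of "frequency-enhanced" token mixing, here is a GFNet-style learned global filter applied in the Fourier domain (2D for brevity, though the paper targets volumetric data; this is not FEFormer's actual FGMLP):

```python
import torch
import torch.nn as nn

class GlobalFrequencyFilter(nn.Module):
    def __init__(self, h, w, c):
        super().__init__()
        # Learned complex filter over the half-spectrum of an h x w grid
        self.weight = nn.Parameter(torch.randn(h, w // 2 + 1, c, 2) * 0.02)

    def forward(self, x):                      # x: [B, H, W, C]
        f = torch.fft.rfft2(x, dim=(1, 2))     # [B, H, W//2+1, C]
        f = f * torch.view_as_complex(self.weight)
        return torch.fft.irfft2(f, s=x.shape[1:3], dim=(1, 2))
```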
- Injecting Distributional Awareness into MLLMs via Reinforcement Learning for Deep Imbalanced Regression
A plug-and-play RL method adds batch-level distributional supervision via CCC rewards to reduce regression-to-the-mean in MLLMs on imbalanced regression benchmarks.
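A minimal sketch of a batch-level CCC (Concordance Correlation Coefficient) reward, assuming "CCC rewards" means scoring a batch of numeric predictions against targets and feeding the scalar back as an RL reward; the paper's exact reward shaping may differ:

```python
import torch

def ccc_reward(pred, target, eps=1e-8):
    """pred, target: [N] numeric predictions/labels for one batch.
    CCC compares whole distributions (means AND variances), so a policy
    that collapses to the label mean is penalized even when its MAE
    looks fine -- countering regression-to-the-mean."""
    pm, tm = pred.mean(), target.mean()
    pv, tv = pred.var(unbiased=False), target.var(unbiased=False)
    cov = ((pred - pm) * (target - tm)).mean()
    return 2 * cov / (pv + tv + (pm - tm) ** 2 + eps)
```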
- PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting