International Conference on Learning Representations (ICLR)

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

How to Scale Mixture-of-Experts: From muP to the Maximally Scale-Stable Parameterization

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

The authors derive a Maximally Scale-Stable Parameterization (MSSP) for MoE models that achieves robust learning-rate transfer and monotonic performance gains with scale across co-scaling regimes of width, experts, and sparsity.

ViT-K: A Few-Shot Learning Model for Coupled Fluid-Porous Media Flows with Interface Conditions

math.NA · 2026-05-13 · unverdicted · novelty 7.0

ViT-K uses Vision Transformers and Koopman operators to learn stable long-term spatiotemporal dynamics of coupled fluid-porous media flows from sparse data.

The Wittgensteinian Representation Hypothesis: Is Language the Attractor of Multimodal Convergence?

cs.AI · 2026-05-10 · unverdicted · novelty 7.0

Language representations serve as the asymptotic attractor for convergence in independently trained multimodal neural networks due to feature density asymmetry.

Linking Extreme Discourse to Structural Polarization in Signed Interaction Networks

cs.SI · 2026-05-12 · unverdicted · novelty 6.0

A pipeline derives continuous signed edges from LLM stance scores on text and links discourse signals such as toxicity and extreme claims to changes in structural polarization measured by spectral and frustration scores on Reddit Brexit data.

Learning Quantifiable Visual Explanations Without Ground-Truth

cs.AI · 2026-05-18 · unverdicted · novelty 5.0

A perturbation-based metric for XAI quality that formalizes sufficiency and necessity, paired with an adapter trained via differentiable supervision to generate causal explanations on black-box models.

citing papers explorer

How to Scale Mixture-of-Experts: From muP to the Maximally Scale-Stable Parameterization

cs.LG · 2026-05-13 · unverdicted · none · ref 62

The authors derive a Maximally Scale-Stable Parameterization (MSSP) for MoE models that achieves robust learning-rate transfer and monotonic performance gains with scale across co-scaling regimes of width, experts, and sparsity.

ViT-K: A Few-Shot Learning Model for Coupled Fluid-Porous Media Flows with Interface Conditions

math.NA · 2026-05-13 · unverdicted · none · ref 42

ViT-K uses Vision Transformers and Koopman operators to learn stable long-term spatiotemporal dynamics of coupled fluid-porous media flows from sparse data.

The Wittgensteinian Representation Hypothesis: Is Language the Attractor of Multimodal Convergence?

cs.AI · 2026-05-10 · unverdicted · none · ref 18

Language representations serve as the asymptotic attractor for convergence in independently trained multimodal neural networks due to feature density asymmetry.

Linking Extreme Discourse to Structural Polarization in Signed Interaction Networks

cs.SI · 2026-05-12 · unverdicted · none · ref 46

A pipeline derives continuous signed edges from LLM stance scores on text and links discourse signals such as toxicity and extreme claims to changes in structural polarization measured by spectral and frustration scores on Reddit Brexit data.

Learning Quantifiable Visual Explanations Without Ground-Truth

cs.AI · 2026-05-18 · unverdicted · none · ref 70

A perturbation-based metric for XAI quality that formalizes sufficiency and necessity, paired with an adapter trained via differentiable supervision to generate causal explanations on black-box models.
