LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation. International Conference on Learning Representations (ICLR), 2026

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

years

2026 6

verdicts

UNVERDICTED 6

polarities

background 1

representative citing papers

LoopQ: Quantization for Recursive Transformers

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

LoopQ provides a loop-aware PTQ framework for recursive Transformers that mitigates distribution shift, state reuse, and recursive error accumulation, yielding 68.8% higher average accuracy and 87.7% lower perplexity under W4A4 versus static baselines.

Looped World Models

cs.LG · 2026-06-16 · unverdicted · novelty 6.0

Introduces looped transformer architectures for world models that iteratively refine latent states to achieve up to 100x parameter efficiency via adaptive computation depth.

Fixed-Point Masked Generative Modeling

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

FP-MGMs with consistency loss and three-state reuse (CoFRe) reduce parameters by up to 38.8% and improve low-budget perplexity and FID versus standard masked generative models on text and images.

Elastic Attention Cores for Scalable Vision Transformers

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.

citing papers explorer

  • LoopQ: Quantization for Recursive Transformers cs.LG · 2026-05-08 · unverdicted · none · ref 17

    LoopQ provides a loop-aware PTQ framework for recursive Transformers that mitigates distribution shift, state reuse, and recursive error accumulation, yielding 68.8% higher average accuracy and 87.7% lower perplexity under W4A4 versus static baselines.

  • Looped World Models cs.LG · 2026-06-16 · unverdicted · none · ref 13

    Introduces looped transformer architectures for world models that iteratively refine latent states to achieve up to 100x parameter efficiency via adaptive computation depth.

  • Fixed-Point Reasoners: Stable and Adaptive Deep Looped Transformers cs.AI · 2026-06-16 · unverdicted · none · ref 79

    FPRM is a Transformer-based model using fixed-point convergence for adaptive halting in looped architectures, claimed effective on Sudoku, Maze, state-tracking, and ARC-AGI benchmarks.

  • Fixed-Point Masked Generative Modeling cs.LG · 2026-05-29 · unverdicted · none · ref 36

    FP-MGMs with consistency loss and three-state reuse (CoFRe) reduce parameters by up to 38.8% and improve low-budget perplexity and FID versus standard masked generative models on text and images.

  • Elastic Attention Cores for Scalable Vision Transformers cs.CV · 2026-05-12 · unverdicted · none · ref 43

    VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.

  • HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering cs.AI · 2026-04-22 · unverdicted · none · ref 62

    HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.