Zico Kolter, and Vladlen Koltun

Shaojie Bai, J · 2019 · arXiv 1909.01377

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

representative citing papers

Stability and Generalization in Looped Transformers

cs.LG · 2026-04-16 · unverdicted · novelty 8.0

Looped transformers with recall and outer normalization produce reachable, input-dependent fixed points with stable gradients, enabling generalization, while those without recall cannot; a new internal recall variant performs competitively or better.

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

cs.LG · 2025-02-07 · unverdicted · novelty 7.0

A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.

Quantum Injection Pathways for Implicit Graph Neural Networks

quant-ph · 2026-05-09 · unverdicted · novelty 6.0

Independent quantum signal injection into graph DEQs yields higher test accuracy and fewer solver iterations than state-dependent or backbone-dependent injection and classical equilibrium models on NCI1, PROTEINS, and MUTAG benchmarks.

State Stream Transformer (SST) V2: Parallel Training of Nonlinear Recurrence for Latent Space Reasoning

cs.LG · 2026-04-30 · unverdicted · novelty 6.0

SST V2 introduces parallel-trainable nonlinear recurrence in latent space to let transformers reason continuously across positions, delivering +15 points on GPQA-Diamond and halving remaining GSM8K errors over matched baselines.

Parcae: Scaling Laws For Stable Looped Language Models

cs.LG · 2026-04-14 · unverdicted · novelty 6.0

Parcae stabilizes looped LLMs via spectral norm constraints on injection parameters, enabling power-law scaling for training FLOPs and saturating exponential scaling at test time that improves quality over fixed-depth baselines under fixed parameter budgets.

ELT: Elastic Looped Transformers for Visual Generation

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

Elastic Looped Transformers share weights across recurrent blocks and apply intra-loop self-distillation to deliver 4x parameter reduction while matching competitive FID and FVD scores on ImageNet and UCF-101.

A Nonlinear Separation Principle via Contraction Theory: Applications to Neural Networks, Control, and Learning

eess.SY · 2026-04-16 · unverdicted · novelty 5.0

A contraction-theory separation principle yields global exponential stability for controller-observer pairs and sharp LMI certificates for contractive RNNs, enabling stable output tracking and implicit neural network design.

citing papers explorer

Showing 7 of 7 citing papers.

Stability and Generalization in Looped Transformers cs.LG · 2026-04-16 · unverdicted · none · ref 2
Looped transformers with recall and outer normalization produce reachable, input-dependent fixed points with stable gradients, enabling generalization, while those without recall cannot; a new internal recall variant performs competitively or better.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach cs.LG · 2025-02-07 · unverdicted · none · ref 12
A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.
Quantum Injection Pathways for Implicit Graph Neural Networks quant-ph · 2026-05-09 · unverdicted · none · ref 7
Independent quantum signal injection into graph DEQs yields higher test accuracy and fewer solver iterations than state-dependent or backbone-dependent injection and classical equilibrium models on NCI1, PROTEINS, and MUTAG benchmarks.
State Stream Transformer (SST) V2: Parallel Training of Nonlinear Recurrence for Latent Space Reasoning cs.LG · 2026-04-30 · unverdicted · none · ref 50
SST V2 introduces parallel-trainable nonlinear recurrence in latent space to let transformers reason continuously across positions, delivering +15 points on GPQA-Diamond and halving remaining GSM8K errors over matched baselines.
Parcae: Scaling Laws For Stable Looped Language Models cs.LG · 2026-04-14 · unverdicted · none · ref 7
Parcae stabilizes looped LLMs via spectral norm constraints on injection parameters, enabling power-law scaling for training FLOPs and saturating exponential scaling at test time that improves quality over fixed-depth baselines under fixed parameter budgets.
ELT: Elastic Looped Transformers for Visual Generation cs.CV · 2026-04-10 · unverdicted · none · ref 4
Elastic Looped Transformers share weights across recurrent blocks and apply intra-loop self-distillation to deliver 4x parameter reduction while matching competitive FID and FVD scores on ImageNet and UCF-101.
A Nonlinear Separation Principle via Contraction Theory: Applications to Neural Networks, Control, and Learning eess.SY · 2026-04-16 · unverdicted · none · ref 3
A contraction-theory separation principle yields global exponential stability for controller-observer pairs and sharp LMI certificates for contractive RNNs, enabling stable output tracking and implicit neural network design.

Zico Kolter, and Vladlen Koltun

fields

years

verdicts

representative citing papers

citing papers explorer