Pith · machine review for the scientific record

Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields: cs.LG (3) · years: 2026 (3) · verdicts: unverdicted (3)

representative citing papers

Solve the Loop: Attractor Models for Language and Reasoning

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Attractor Models solve for fixed points of transformer embedding updates and backpropagate through them with implicit differentiation, enabling stable iterative refinement and delivering better perplexity, accuracy, and efficiency than standard or looped transformers.
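
For orientation, a minimal PyTorch-style sketch of the general pattern (not this paper's code): iterate an update map to a fixed point outside the autograd tape, then backpropagate through the fixed point by solving the implicit adjoint equation rather than unrolling the iterations. The layer name, update map, and naive solvers below are illustrative assumptions.

```python
# Illustrative sketch only; names and solvers are assumptions, not the cited paper's code.
import torch
import torch.nn as nn


class AttractorLayer(nn.Module):
    def __init__(self, dim, max_iters=30, tol=1e-4):
        super().__init__()
        # Update map f(z, x): one refinement step of the latent z given the input embedding x.
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.max_iters, self.tol = max_iters, tol

    def step(self, z, x):
        return self.f(torch.cat([z, x], dim=-1))

    def forward(self, x):
        # 1) Find the fixed point z* = f(z*, x) by plain iteration, outside the autograd tape.
        z = torch.zeros_like(x)
        with torch.no_grad():
            for _ in range(self.max_iters):
                z_next = self.step(z, x)
                if (z_next - z).norm() < self.tol:
                    z = z_next
                    break
                z = z_next
        # 2) Re-attach gradients with a single differentiable step at the fixed point.
        z = self.step(z, x)
        # 3) Implicit differentiation: the backward pass solves u = (df/dz)^T u + grad at z*
        #    by fixed-point iteration, instead of unrolling the forward solver.
        z0 = z.detach().requires_grad_()
        f0 = self.step(z0, x)

        def backward_hook(grad):
            u = grad
            for _ in range(self.max_iters):
                u = torch.autograd.grad(f0, z0, u, retain_graph=True)[0] + grad
            return u

        if z.requires_grad:
            z.register_hook(backward_hook)
        return z


# Example: refine a batch of embeddings to their attractor.
layer = AttractorLayer(dim=64)
out = layer(torch.randn(8, 64))
```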

Muon Does Not Converge on Convex Lipschitz Functions

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

Muon does not converge on convex Lipschitz functions regardless of learning rate, while error feedback restores theoretical convergence but degrades performance on CIFAR-10 and nanoGPT tasks.
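
For context, a toy sketch of the two update rules under comparison, assuming the usual description of Muon (momentum followed by Newton-Schulz orthogonalization of each weight matrix's update) plus a generic error-feedback correction. The paper's exact error-feedback construction may differ, and production Muon uses a tuned quintic iteration rather than the plain cubic one shown here.

```python
# Toy sketch; hyperparameters and the error-feedback rule are illustrative assumptions.
import torch


def orthogonalize(m, steps=8):
    # Approximate the orthogonal polar factor (U V^T of the SVD) with a cubic
    # Newton-Schulz iteration; Frobenius normalization keeps singular values in (0, 1].
    x = m / (m.norm() + 1e-7)
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ x.T @ x
    return x


def muon_step(w, grad, momentum, lr=0.02, beta=0.95):
    # Muon-style update: momentum buffer, then orthogonalize the resulting matrix.
    momentum.mul_(beta).add_(grad)
    w.sub_(lr * orthogonalize(momentum))


def muon_ef_step(w, grad, momentum, error, lr=0.02, beta=0.95):
    # Generic error-feedback pattern around the orthogonalization "compressor":
    # correct the proposed update with the carried error, apply the compressed
    # version, and store what was dropped for the next step.
    momentum.mul_(beta).add_(grad)
    proposed = lr * momentum + error
    applied = lr * orthogonalize(proposed)
    error.copy_(proposed - applied)
    w.sub_(applied)


# Example state for one weight matrix.
w = torch.randn(256, 128)
momentum, error = torch.zeros_like(w), torch.zeros_like(w)
muon_ef_step(w, torch.randn_like(w), momentum, error)
```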

Can Muon Fine-tune Adam-Pretrained Models?

cs.LG · 2026-05-11 · unverdicted · novelty 4.0

Constraining fine-tuning updates with LoRA mitigates performance degradation when switching from Adam to Muon on pretrained models.
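
A minimal sketch of the constraint being described, assuming the standard LoRA formulation (wrapper name and hyperparameters are illustrative): the pretrained weight stays frozen and only the low-rank adapter matrices are exposed to whichever optimizer performs the fine-tuning.

```python
# Standard LoRA pattern; class name and defaults are illustrative assumptions.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen base projection plus the constrained low-rank update B A x.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T


# Only the adapter parameters are handed to the fine-tuning optimizer,
# so switching optimizers (e.g. Adam -> Muon) only affects the low-rank update.
layer = LoRALinear(nn.Linear(512, 512))
adapter_params = [p for p in layer.parameters() if p.requires_grad]
```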

citing papers explorer

Showing 3 of 3 citing papers.

  • Solve the Loop: Attractor Models for Language and Reasoning · cs.LG · 2026-05-12 · unverdicted · none · ref 37

  • Muon Does Not Converge on Convex Lipschitz Functions · cs.LG · 2026-05-09 · unverdicted · none · ref 69

  • Can Muon Fine-tune Adam-Pretrained Models? · cs.LG · 2026-05-11 · unverdicted · none · ref 8
