Manan Tomar, Lior Shani, Yonathan Efroni, and Mohammad Ghavamzadeh

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method: 3

citation-polarity summary

roles: method 3

polarities: use method 3

representative citing papers

Mirror Descent-Ascent for mean-field min-max problems

math.OC · 2024-02-12 · unverdicted · novelty 7.0

Establishes O(N^{-1/2}) convergence for simultaneous MDA and O(N^{-2/3}) for alternating MDA to mixed Nash equilibria in mean-field convex-concave min-max problems via dual-space Bregman analysis.

Credit Assignment with Resets in Language Model Reasoning

cs.AI · 2026-05-25 · unverdicted · novelty 6.0

The paper introduces Random-Reset Policy Optimization (RRPO) and Self-Reset Policy Optimization (SRPO), which use resets to enable more precise credit assignment in RL for language model reasoning; SRPO outperforms GRPO and RRPO across benchmarks.

Muon is Scalable for LLM Training

cs.LG · 2025-02-24 · unverdicted · novelty 6.0

Muon optimizer with weight decay and update scaling achieves ~2x efficiency over AdamW for large LLMs, shown via the Moonlight 3B/16B MoE model trained on 5.7T tokens.
