More recent methods, such as ONPO (Zhang et al., 2025a) and EGPO (Zhou et al., 2025), introduce optimism and extragradient techniques for stable convergence under noisy preferences

· 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Multiplayer Nash Preference Optimization

cs.AI · 2025-09-27 · unverdicted · novelty 6.0

MNPO extends NLHF to multiplayer Nash games, inheriting equilibrium guarantees while showing empirical gains on instruction-following benchmarks under diverse preferences.

citing papers explorer

Showing 1 of 1 citing paper.

Multiplayer Nash Preference Optimization cs.AI · 2025-09-27 · unverdicted · none · ref 44
MNPO extends NLHF to multiplayer Nash games, inheriting equilibrium guarantees while showing empirical gains on instruction-following benchmarks under diverse preferences.

More recent methods, such as ONPO (Zhang et al., 2025a) and EGPO (Zhou et al., 2025), introduce optimism and extragradient techniques for stable convergence under noisy preferences

fields

years

verdicts

representative citing papers

citing papers explorer