Wiley New York

R Duncan Luce et al · 1959

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

citation-role summary

method 2 background 1

citation-polarity summary

use method 2 background 1

representative citing papers

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.

Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

GraphDPO generalizes pairwise DPO to a graph-structured Plackett-Luce objective over DAGs induced by rollout rankings, enforcing transitivity with linear complexity and recovering DPO as a special case.

ModelLens: Finding the Best for Your Task from Myriads of Models

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

ModelLens learns a performance-aware latent space from 1.62M leaderboard records to rank unseen models on unseen datasets without forward passes on the target.

Heterogeneous Judge-Aware Ranking with Sensitivity, Disagreement, and Confidence

stat.ME · 2026-05-06 · unverdicted · novelty 6.0

HJA ranking separates consensus ranking, judge sensitivity, and residual disagreement as distinct inferential targets with identifiability conditions and an anchored alternating algorithm, yielding better recovery and uncertainty calibration than pooled baselines on synthetic and real data.

citing papers explorer

Showing 4 of 4 citing papers.

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization cs.LG · 2026-05-11 · unverdicted · none · ref 39
MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph cs.LG · 2026-05-08 · unverdicted · none · ref 27
GraphDPO generalizes pairwise DPO to a graph-structured Plackett-Luce objective over DAGs induced by rollout rankings, enforcing transitivity with linear complexity and recovering DPO as a special case.
ModelLens: Finding the Best for Your Task from Myriads of Models cs.LG · 2026-05-08 · unverdicted · none · ref 31
ModelLens learns a performance-aware latent space from 1.62M leaderboard records to rank unseen models on unseen datasets without forward passes on the target.
Heterogeneous Judge-Aware Ranking with Sensitivity, Disagreement, and Confidence stat.ME · 2026-05-06 · unverdicted · none · ref 26
HJA ranking separates consensus ranking, judge sensitivity, and residual disagreement as distinct inferential targets with identifiability conditions and an anchored alternating algorithm, yielding better recovery and uncertainty calibration than pooled baselines on synthetic and real data.

Wiley New York

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer