MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
Batchbald: Efficient and diverse batch acquisition for deep bayesian active learning.Advances in neural information processing systems, 32
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
A differentiable neural operator learns the mapping from granular microstructure configurations to failure envelopes, with physics-informed convexity enforcement and active learning for efficient training.
citing papers explorer
-
MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization
MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
-
Neural Operator Representation of Granular Micromechanics-based Failure Envelope
A differentiable neural operator learns the mapping from granular microstructure configurations to failure envelopes, with physics-informed convexity enforcement and active learning for efficient training.