arxiv: 2606.24004 · 2 revisions
Towards Spec Learning: Inference-Time Alignment from Preference Pairs