ReasonRank synthesizes reasoning-intensive training data using DeepSeek-R1 and applies a two-stage SFT plus RL process with a novel multi-view ranking reward to create a listwise reranker that outperforms baselines with lower latency than pointwise methods.
The LoRA parameters rank and alpha are both set to 32.
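As a minimal sketch of what this setting implies, the snippet below assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / rank. The `LoraSettings` class, its method names, and the 4096-dimensional layer size are hypothetical and chosen only for illustration; they are not taken from the ReasonRank paper or from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the two LoRA hyperparameters quoted above."""

    rank: int = 32   # value stated in the source
    alpha: int = 32  # value stated in the source

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank

    def extra_params(self, d_in: int, d_out: int) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank),
        # so one adapted weight matrix adds rank * (d_in + d_out) parameters.
        return self.rank * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)

# Assumed layer size for illustration only; the source does not give one.
print(settings.extra_params(d_in=4096, d_out=4096))
```

With rank equal to alpha, the scaling factor is 1, so the low-rank update is added to the frozen weights without further rescaling.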
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability