arXiv preprint arXiv:2509.13758
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding
  CoRD uses collaborative multi-teacher step-wise decoding with perplexity-guided beam search to generate higher-quality Long-CoT data that lets smaller models reach near-teacher performance with less supervision.
- Understanding Conversational Patterns in Multi-agent Programming: A Case Study on Fibonacci Game Development
  Case study of 12 LLM agent pairs on Fibonacci game development finds only DeepSeek-R1:DeepSeek-R1 converges correctly from the first iteration while others either diverge or fail to converge.