SDRT: Enhance vision-language models by self-distillation with diverse reasoning traces

Guande Wu, Huan Song, Yawei Wang, Qiaojing Yan, Yijun Tian, Lin Lee Cheong, Panpan Xu · 2025 · arXiv 2503.01754

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

use method 1

cs.MM · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

cs.CV · 2026-05-12 · 3 refs

Showing 2 of 2 citing papers.

Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation · cs.MM · 2026-05-12 · unverdicted · none · ref 14 · 2 links
Visual debiasing of omni-modal benchmarks combined with staged post-training lets a 3B model match or exceed a 30B model without a stronger teacher.
Hide to See: Reasoning-prefix Masking for Visual-anchored Thinking in VLM Distillation · cs.CV · 2026-05-12 · unreviewed · ref 31 · 3 links