Dual-forward path teacher knowledge distillation: Bridging the capacity gap between teacher and student
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models
LIFT decomposes distillation into coarse linear alignment then fine refinement while PLACE adds error-based local adaptation, allowing stable training of 1.3M-parameter students (1.6% teacher size) to FID 15.73 across diffusion and flow models.
- Beyond Dark Knowledge: Mixup-Based Distillation for Reliable Predictions
Mixup applied only to the student during KD induces independent linearity acquisition that reduces overconfidence by an order of magnitude while improving accuracy, with calibration transferring separately from accuracy.