
arxiv: 2211.16231 · v3 · pith:3KIY2JR5new · submitted 2022-11-29 · 💻 cs.CV

Curriculum Temperature for Knowledge Distillation

classification 💻 cs.CV
keywords distillation, temperature, difficulty, ctkd, curriculum, knowledge, level, task

Most existing distillation methods ignore the flexible role of the temperature in the loss function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In general, the temperature controls the discrepancy between two distributions and can faithfully determine the difficulty level of the distillation task. Keeping a constant temperature, i.e., a fixed level of task difficulty, is usually sub-optimal for a growing student during its progressive learning stages. In this paper, we propose a simple curriculum-based technique, termed Curriculum Temperature for Knowledge Distillation (CTKD), which controls the task difficulty level during the student's learning career through a dynamic and learnable temperature. Specifically, following an easy-to-hard curriculum, we gradually increase the distillation loss w.r.t. the temperature, leading to increased distillation difficulty in an adversarial manner. As an easy-to-use plug-in technique, CTKD can be seamlessly integrated into existing knowledge distillation frameworks and brings general improvements at a negligible additional computation cost. Extensive experiments on CIFAR-100, ImageNet-2012, and MS-COCO demonstrate the effectiveness of our method. Our code is available at https://github.com/zhengli97/CTKD.
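As a rough illustration of the mechanism the abstract describes, the sketch below pairs a temperature-scaled distillation loss with a temperature that is updated adversarially (gradient ascent on the loss) under an easy-to-hard schedule. It assumes the standard softened-softmax KL distillation loss scaled by T^2. The cosine ramp, the finite-difference gradient, the clamp value, and all function names are illustrative assumptions, not taken from the paper or its repository.

```python
import math


def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def curriculum_weight(step, total_steps, max_weight=1.0):
    """Hypothetical easy-to-hard schedule: cosine ramp from 0 to max_weight."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * 0.5 * (1.0 - math.cos(math.pi * progress))


def adversarial_temperature_step(student_logits, teacher_logits, temperature,
                                 weight, lr=0.1, eps=1e-4, t_min=0.5):
    """One gradient-ASCENT step on the loss w.r.t. the temperature.

    The gradient is estimated by central finite differences (an assumption
    made to keep the sketch dependency-free). `weight` is the curriculum
    weight: 0 leaves the temperature unchanged, larger values let the
    temperature push the loss up harder.
    """
    up = kd_loss(student_logits, teacher_logits, temperature + eps)
    down = kd_loss(student_logits, teacher_logits, temperature - eps)
    grad = (up - down) / (2 * eps)
    # Clamp so the temperature stays positive (t_min is an assumed bound).
    return max(t_min, temperature + lr * weight * grad)


if __name__ == "__main__":
    teacher = [3.0, 1.0, 0.2]
    student = [1.0, 2.0, 0.5]
    temperature = 4.0
    total_steps = 5
    for step in range(total_steps + 1):
        w = curriculum_weight(step, total_steps)
        temperature = adversarial_temperature_step(student, teacher, temperature, w)
        print(step, round(w, 3), temperature, kd_loss(student, teacher, temperature))
```

In a real training loop the student would minimize the same loss while the temperature maximizes it. Here the student logits are held fixed so that only the temperature side of the game is visible.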



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. APRIL-MedSeg: A Modular Medical Image Segmentation Toolbox Embracing Modern Paradigms

    cs.CV 2026-06 unverdicted novelty 5.0

    APRIL-MedSeg is a new open-source modular toolbox that uses YAML configuration and component registries to unify multiple advanced paradigms for medical image segmentation.

  2. LEAP: Layer-skipping Efficiency via Adaptive Progression for Vision Transformer Distillation

    cs.CV 2026-06 unverdicted novelty 5.0

    LEAP is an adaptive layer-skipping curriculum for ViT feature distillation that reports accuracy gains on ImageNet and retrieval tasks plus training compute savings.

  3. APRIL-MedSeg: A Modular Medical Image Segmentation Toolbox Embracing Modern Paradigms

    cs.CV 2026-06 unverdicted novelty 4.0

    Presents APRIL-MedSeg, a modular YAML-configurable toolbox for 2D medical image segmentation integrating semi-supervised, domain adaptation, distillation, weakly supervised, text-guided, and foundation model paradigms...