In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
An autoencoder with minimal latent entropy loss enables fully unsupervised video anomaly detection by concentrating normal latent embeddings and producing poor reconstructions for anomalies.
Citing papers explorer
- On the Generalization of Knowledge Distillation: An Information-Theoretic View
  Derives upper and lower generalization bounds for the student relative to the teacher using a new distillation divergence, plus a loss-sharpness-aware bound and a bias-variance-rank decomposition in the linear Gaussian case.
- MLE-UVAD: Minimal Latent Entropy Autoencoder for Fully Unsupervised Video Anomaly Detection
  An autoencoder with minimal latent entropy loss enables fully unsupervised video anomaly detection by concentrating normal latent embeddings and producing poor reconstructions for anomalies.