Born again neural networks

Born Again Neural Networks · 2018 · arXiv 1805.04770

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TallyTrain: Communication-Efficient Federated Distillation

cs.LG · 2026-06-30 · unverdicted · novelty 7.0

TallyTrain is a hard-label distillation protocol for federated learning that uses argmax transmission and optional sparse merges to match soft-label performance at up to 1000x lower communication cost.

Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed

cs.LG · 2021-01-07 · unverdicted · novelty 6.0

Denoising Student distills the multi-step denoising process of score-based and diffusion models into a single forward pass, matching GAN sampling speed while producing comparable sample quality on CIFAR-10, CelebA, and 256x256 LSUN.

q0: Primitives for Hyper-Epoch Pretraining

cs.LG · 2026-06-02 · unverdicted · novelty 5.0

q0 turns multi-epoch budgets into diverse model populations using three primitives that outperform single-model training and strong ensembles with fewer epochs on a 1.8B model.

Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks, Multi Task Learning, Semi-Supervised Learning

cs.CV · 2019-07-08 · unverdicted · novelty 4.0

Proposes Distill-2MD-MTL, an MTL-based data distillation framework for semi-supervised multi-domain face analysis tasks that claims better performance than single-task baselines.

citing papers explorer

Showing 4 of 4 citing papers.

TallyTrain: Communication-Efficient Federated Distillation cs.LG · 2026-06-30 · unverdicted · none · ref 35
Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed cs.LG · 2021-01-07 · unverdicted · none · ref 7
q0: Primitives for Hyper-Epoch Pretraining cs.LG · 2026-06-02 · unverdicted · none · ref 21
Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks, Multi Task Learning, Semi-Supervised Learning cs.CV · 2019-07-08 · unverdicted · none · ref 9
