Real Image Denoising with Knowledge Distillation for High-Performance Mobile NPUs
Pith reviewed 2026-05-07 17:55 UTC · model grok-4.3
The pith
A 1.96M-parameter LiteDenoiseNet student model achieves 37.58 dB PSNR on full-resolution real image denoising benchmarks while running in 34-46 ms on mobile NPUs by leveraging NPU-compatible primitives and high-alpha knowledge distillation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The 1.96M-parameter student recovers 99.8% of the teacher's restoration quality via high-alpha knowledge distillation (alpha = 0.9), achieving a 21.2x parameter reduction while closing the PSNR gap from 1.63 dB to only 0.05 dB.
Load-bearing premise
That restricting the student to NPU-native primitives (3x3 convolutions, ReLU, nearest-neighbor upsampling) combined with progressive context expansion up to 1024x1024 crops will preserve generalization on real-world noisy images without significant quality loss outside the specific Mobile AI 2026 benchmarks.
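The high-alpha distillation objective referenced above can be sketched as a weighted sum of a teacher-matching term and a ground-truth term. The per-pixel MSE loss form below is an assumption for illustration; the abstract only specifies alpha = 0.9, not the loss function itself.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences of pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_out, teacher_out, ground_truth, alpha=0.9):
    """High-alpha knowledge distillation: with alpha = 0.9 the student is
    driven mostly toward the teacher's output, with a small residual term
    anchoring it to the clean target. The MSE choice is illustrative; the
    paper may use a different per-pixel loss."""
    return (alpha * mse(student_out, teacher_out)
            + (1 - alpha) * mse(student_out, ground_truth))

# Toy pixel values: student close to teacher, teacher close to ground truth.
student = [0.50, 0.52, 0.48]
teacher = [0.51, 0.53, 0.47]
truth   = [0.52, 0.54, 0.46]
loss = distillation_loss(student, teacher, truth)
```

With alpha = 0.9, nine tenths of the gradient signal comes from the teacher-matching term, which is what makes the teacher's choice of outputs, rather than the raw targets, the dominant supervision.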
Figures
Original abstract
While deep-learning-based image restoration has achieved unprecedented fidelity, deployment on mobile Neural Processing Units (NPUs) remains bottlenecked by operator incompatibility and memory-access overhead. We propose an NPU-aware hardware-algorithm co-design approach for real-world image denoising on mobile NPUs. Our approach employs a high-capacity teacher to supervise a lightweight student network specifically designed to leverage the tiled-memory architectures of modern mobile SoCs. By prioritizing NPU-native primitives -- standard 3x3 convolutions, ReLU activations, and nearest-neighbor upsampling -- and employing a progressive context expansion strategy (up to 1024x1024 crops), the model achieves 37.66 dB PSNR / 0.9278 SSIM on the validation benchmark and 37.58 dB PSNR / 0.9098 SSIM on the held-out test benchmark at full resolution (2432x3200) in the Mobile AI 2026 challenge. Following the official challenge rules, the inference runtime is measured under a standardized Full HD (1088x1920) protocol, where it runs in 34.0 ms on the MediaTek Dimensity 9500 and 46.1 ms on the Qualcomm Snapdragon 8 Elite NPU. We further reveal an "Inference Inversion" effect, where strict adherence to NPU-compatible operations enables dedicated NPU execution up to 3.88x faster than the integrated mobile GPU. The 1.96M-parameter student recovers 99.8% of the teacher's restoration quality via high-alpha knowledge distillation (alpha = 0.9), achieving a 21.2x parameter reduction while closing the PSNR gap from 1.63 dB to only 0.05 dB. These results establish hardware-aware distillation as an effective strategy for unifying high-fidelity denoising with practical deployment across diverse mobile NPU architectures. The proposed lightweight student model (LiteDenoiseNet) and its training statistics are provided in the NN Dataset, available at https://github.com/ABrain-One/NN-Dataset.
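The abstract's operator constraint (standard 3x3 convolutions, ReLU, nearest-neighbor upsampling) makes the student's parameter budget easy to audit, since only the convolutions carry weights. A minimal sketch of that audit follows; the channel plan is a hypothetical placeholder, not the published LiteDenoiseNet configuration (which lives in the linked repository).

```python
def conv3x3_params(c_in, c_out):
    """A 3x3 convolution holds 3*3*c_in*c_out weights plus one bias
    per output channel."""
    return 9 * c_in * c_out + c_out

# Hypothetical encoder-decoder channel plan -- illustrative only,
# not the actual LiteDenoiseNet layout.
channels = [3, 32, 64, 128, 64, 32, 3]
total = sum(conv3x3_params(a, b) for a, b in zip(channels, channels[1:]))
# ReLU and nearest-neighbor upsampling contribute no parameters, so
# the model's budget is the convolution total alone.
```

Under this toy plan the stack holds 186,371 parameters; reaching the reported 1.96M requires wider or deeper stages, but the same one-line audit applies.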
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a hardware-algorithm co-design for real-world image denoising on mobile NPUs. A high-capacity teacher supervises a 1.96M-parameter student (LiteDenoiseNet) restricted to NPU-native primitives (3x3 convolutions, ReLU, nearest-neighbor upsampling) with progressive context expansion up to 1024x1024 crops. It reports 37.66 dB PSNR / 0.9278 SSIM on validation and 37.58 dB PSNR / 0.9098 SSIM on held-out test at full resolution (2432x3200) for the Mobile AI 2026 challenge, with runtimes of 34.0 ms (MediaTek Dimensity 9500) and 46.1 ms (Qualcomm Snapdragon 8 Elite) under standardized Full HD protocol. The student recovers 99.8% of teacher quality via alpha=0.9 knowledge distillation, closing the PSNR gap from 1.63 dB to 0.05 dB (21.2x parameter reduction), and demonstrates an 'Inference Inversion' effect with up to 3.88x NPU speedup over GPU. The model and training statistics are publicly released.
Significance. If the reported metrics hold under the stated protocols, the work is significant for practical deployment of high-fidelity denoising on diverse mobile NPUs. It provides concrete, falsifiable evidence via held-out test metrics, standardized runtime measurements, and public code/dataset release at https://github.com/ABrain-One/NN-Dataset, enabling direct verification of the 99.8% recovery claim and 21.2x reduction. The NPU-aware design turning operator constraints into a performance advantage (Inference Inversion) offers a reproducible template for hardware-aware distillation in computer vision.
major comments (2)
- §5 (Results): The central claim that the student recovers 99.8% of teacher quality and closes the PSNR gap from 1.63 dB to 0.05 dB requires the teacher's absolute PSNR/SSIM values to be stated explicitly alongside the student's, together with the precise formula used for the recovery percentage; without these, independent verification of the gap closure is not possible from the reported numbers alone.
- §4 (Experimental setup): The manuscript reports concrete PSNR/SSIM on validation and held-out test sets but does not specify the data splits, number of random seeds, or error bars; these details are load-bearing for assessing the statistical reliability of the 0.05 dB gap and the generalization of the progressive crop + NPU-primitive design beyond the Mobile AI 2026 benchmarks.
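To make the first objection concrete: one plausible reading of the recovery figure is recovery = student PSNR / teacher PSNR, with the teacher's test PSNR inferred as the student's value plus the remaining gap (37.58 + 0.05 dB). Both the formula and the inferred teacher value are assumptions, which is precisely why the report asks for them to be stated.

```python
student_psnr = 37.58                  # reported held-out test PSNR (dB)
gap_db = 0.05                         # reported remaining teacher-student gap
teacher_psnr = student_psnr + gap_db  # inferred; not stated in the abstract

recovery_pct = 100.0 * student_psnr / teacher_psnr
# This reading gives roughly 99.87%, close to but not exactly the quoted
# 99.8% -- evidence that the paper's exact formula needs to be explicit.
```

If the quoted 99.8% instead derives from validation PSNR, SSIM, or a truncation convention, the arithmetic changes, so the published number cannot be verified without the formula.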
minor comments (2)
- Abstract: The phrase 'following the official challenge rules' for runtime measurement should be expanded with a one-sentence reference to the exact benchmark dataset and resolution protocol used for the PSNR/SSIM figures.
- §3 (Method): The description of the progressive context expansion schedule would benefit from a small table listing the crop sizes and corresponding training epochs to improve reproducibility.
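The table the second minor comment requests could be as small as the sketch below. Only the 1024x1024 endpoint is stated in the abstract; the earlier crop sizes and all epoch counts here are hypothetical placeholders.

```python
# Hypothetical progressive-context-expansion schedule. Only the final
# 1024x1024 crop size appears in the abstract; earlier stages and epoch
# counts are illustrative placeholders.
schedule = [
    {"stage": 1, "crop": 256,  "epochs": None},   # epochs unspecified
    {"stage": 2, "crop": 512,  "epochs": None},
    {"stage": 3, "crop": 1024, "epochs": None},
]
crops = [s["crop"] for s in schedule]
assert all(b == 2 * a for a, b in zip(crops, crops[1:]))  # doubling, if that is the rule
```

Publishing even a three-row table of this shape, with the real crop sizes and epochs filled in, would settle the reproducibility concern.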
Axiom & Free-Parameter Ledger
free parameters (2)
- alpha
- crop_size_schedule
axioms (1)
- domain assumption: NPU-native primitives (3x3 conv, ReLU, nearest-neighbor upsampling) are sufficient to represent high-quality denoising mappings when supervised by a teacher.