pith. machine review for the scientific record.

arxiv: 2605.07810 · v1 · submitted 2026-05-08 · ⚛️ physics.optics · cs.CV

Recognition: 2 theorem links · Lean Theorem

Pre-training Enables Extraordinary All-optical Image Denoising

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 01:50 UTC · model grok-4.3

classification ⚛️ physics.optics cs.CV
keywords optical neural networks · diffractive networks · image denoising · pre-training · transfer learning · all-optical computing · free-space optics

The pith

Pre-training diffractive networks enables all-optical denoising of severely noisy images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Optical neural networks process information with light for potential gains in speed and energy efficiency over electronic systems. This paper shows that pre-training a diffractive network on 3.45 million simple images before fine-tuning it on noisy task-specific data produces better denoising than direct training or standard Fourier filtering. The method handles extreme noise that drops input PSNR below 8 dB, raising output PSNR above 18 dB while keeping fine details. It works across image types from digits and X-rays to natural scenes and faces, and improves downstream tasks such as face detection and object localization under noise.
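For orientation on the dB figures quoted throughout, PSNR is a one-line computation; this toy numpy sketch (not taken from the paper) shows how strong additive noise must be to push PSNR below 8 dB:

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images on a [0, peak] scale."""
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)                      # flat gray test image
noisy = clean + rng.normal(0.0, 0.45, clean.shape)  # severe additive noise
print(f"input PSNR: {psnr(clean, noisy):.1f} dB")   # below the paper's 8 dB threshold
```

With noise of standard deviation 0.45 on a unit-range image, PSNR is roughly 10 log10(1/0.45^2), about 6.9 dB, i.e. the "severe noise" regime the paper targets.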

Core claim

The authors establish that a two-step process for diffractive networks, pre-training on a massive dataset of 3.45 million diverse simple images followed by task-specific fine-tuning, enables snapshot all-optical image denoising that outperforms conventional Fourier-domain filtering and directly trained networks. The advantage is most pronounced for severe noise: for inputs with PSNR below 8 dB, the method achieves output PSNR above 18 dB while preserving fine image features. The result generalizes consistently across highly diverse image domains, including MNIST, ChestMNIST, CIFAR-10, and CelebA.

What carries the argument

The two-step transfer learning process on the diffractive network, with pre-training on a large set of simple images to build general capability before adapting to specific noisy datasets.
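The two-step schedule itself is ordinary transfer learning, and its shape can be caricatured with a purely digital stand-in. The sketch below fits a toy linear denoiser in numpy: the data, dimensions, and hyperparameters are all invented for illustration, and nothing here models the paper's optical physics:

```python
import numpy as np

rng = np.random.default_rng(1)

def make_batch(n, base_images, sigma=0.5):
    """Pair randomly drawn clean images with noisy copies."""
    idx = rng.integers(0, len(base_images), n)
    clean = base_images[idx]
    return clean + rng.normal(0.0, sigma, clean.shape), clean

def fit_linear_denoiser(W, noisy, clean, lr, steps):
    """Plain gradient descent on ||noisy @ W - clean||^2."""
    for _ in range(steps):
        grad = noisy.T @ (noisy @ W - clean) / len(noisy)
        W = W - lr * grad
    return W

d = 16  # flattened toy "image" size

# Step 1: pre-train on a large, generic image distribution.
generic = rng.uniform(0.0, 1.0, (2000, d))
noisy_g, clean_g = make_batch(4000, generic)
W = fit_linear_denoiser(np.eye(d), noisy_g, clean_g, lr=0.1, steps=300)

# Step 2: fine-tune the *same* weights on a small task-specific set
# with different statistics (the analogue of MNIST or CelebA here).
task = rng.uniform(0.0, 1.0, (50, d)) ** 3
noisy_t, clean_t = make_batch(200, task)
W = fit_linear_denoiser(W, noisy_t, clean_t, lr=0.05, steps=100)

mse = np.mean((noisy_t @ W - clean_t) ** 2)  # below the raw noise variance of 0.25
```

The paper's argument is that step 1 buys robustness that direct training (skipping step 1) does not; in this toy, step 1 merely provides a warm start for step 2.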

If this is right

  • The method improves denoising quality for images with severe noise while preserving fine details better than direct training or Fourier filtering.
  • The same pre-trained network can be fine-tuned consistently for image styles ranging from handwritten digits and medical scans to natural scenes and faces.
  • It enhances accuracy in vision applications such as face detection, license plate recognition, and UAV localization under noisy conditions.
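The Fourier-filtering baseline these bullets compare against is essentially ideal low-pass filtering. A minimal numpy sketch (the cutoff `keep_frac` is an invented illustrative parameter) makes its core trade-off concrete: a tighter cutoff removes more noise but blurs exactly the fine features the paper claims to preserve:

```python
import numpy as np

def fourier_lowpass_denoise(img: np.ndarray, keep_frac: float = 0.15) -> np.ndarray:
    """Zero out spatial frequencies beyond a circular cutoff."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    mask = radius <= keep_frac * min(h, w) / 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

rng = np.random.default_rng(2)
clean = np.zeros((64, 64))
clean[20:44, 20:44] = 1.0                          # a square with sharp edges
noisy = clean + rng.normal(0.0, 0.6, clean.shape)
denoised = fourier_lowpass_denoise(noisy)
# The filter cuts the noise floor dramatically, but the square's edges
# ring and blur: the detail loss a learned denoiser is supposed to avoid.
```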

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Physical optical computing devices may need large-scale pre-training stages to reach performance levels comparable to digital models.
  • The approach points toward compact, low-energy optical front-ends that could perform denoising directly at the sensor before any electronic processing.
  • The same pre-training strategy might extend to other all-optical tasks such as classification or feature extraction in free-space systems.

Load-bearing premise

Simulated performance of the pre-trained and fine-tuned diffractive network will translate directly to a physical free-space optical system without major degradation from fabrication errors, alignment issues, or unmodeled aberrations.

What would settle it

A physical free-space implementation of the pre-trained diffractive network that fails to raise PSNR from below 8 dB to above 18 dB on severely noisy test images or loses fine features would falsify the claimed advantage of the transfer learning approach.

original abstract

Optical neural networks are emerging as powerful machine learning and information processing tools because of their potential advantages in speed and energy efficiency. The training methods of these physical models, however, remain underexplored compared to their digital counterparts and are leading to suboptimal performance. This paper reports a pre-training-driven approach that leads to snapshot image denoising with substantially improved quality. We demonstrated effective free-space optical denoising by a diffractive network optimized by a two-step process including (1) pre-training using a massive dataset of 3.45 million diverse but simple images and (2) fine-tuning with the corresponding task-specific datasets. Compared to conventional Fourier-domain filtering and directly trained diffractive networks, such a transfer learning process exhibited prominent advantages for denoising images degraded by severe noise, peak signal-to-noise ratio (PSNR) below 8 dB, while preserving fine image features and improving the PSNR to above 18 dB. Importantly, the same pre-trained optical network could be consistently fine-tuned to process degraded images from highly diverse styles ranging from handwritten digits (MNIST) and chest X-rays (ChestMNIST) to CIFAR-10 images and human faces (CelebA). We further demonstrated the critical role of our optical denoisers in vision-based applications, including face detection, plate recognition, and localization of UAVs in noisy conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a two-step pre-training and fine-tuning strategy for diffractive optical neural networks to perform all-optical image denoising in free space. Pre-training is performed on 3.45 million simple images, followed by fine-tuning on task-specific datasets including MNIST, ChestMNIST, CIFAR-10, and CelebA. The approach is reported to achieve substantial PSNR improvements from below 8 dB to above 18 dB for severely noisy images, outperforming Fourier-domain filtering and directly trained networks, while preserving fine features, and is applied to downstream vision tasks like face detection and UAV localization.

Significance. If the physical implementation is validated, this result would highlight the potential of transfer learning in optical computing systems, offering a path to more robust and versatile all-optical processors that could operate at high speed and low energy for image processing applications.

major comments (1)
  1. [Abstract] The claim of demonstrating effective free-space optical denoising by a diffractive network optimized via the two-step process is not supported by any quantitative hardware validation data, such as side-by-side simulated vs. measured PSNR values, error bars from multiple trials, alignment tolerance analysis, or an error budget for fabrication and unmodeled aberrations. This directly undermines assessment of the central claim that the simulated performance (PSNR rise from <8 dB to >18 dB) translates to physical hardware across the reported datasets.
minor comments (1)
  1. The description of the pre-training dataset composition and the exact fine-tuning hyperparameters could be expanded for reproducibility, though this does not affect the core claims.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment point by point below.

point-by-point responses
  1. Referee: [Abstract] The claim of demonstrating effective free-space optical denoising by a diffractive network optimized via the two-step process is not supported by any quantitative hardware validation data, such as side-by-side simulated vs. measured PSNR values, error bars from multiple trials, alignment tolerance analysis, or an error budget for fabrication and unmodeled aberrations. This directly undermines assessment of the central claim that the simulated performance (PSNR rise from <8 dB to >18 dB) translates to physical hardware across the reported datasets.

    Authors: We agree that the manuscript presents only numerical simulation results and does not include experimental hardware validation data such as measured PSNR values, error bars, or detailed alignment/fabrication error budgets. The work focuses on the algorithmic innovation of pre-training on 3.45 million images followed by fine-tuning, with all quantitative claims (e.g., PSNR rising from below 8 dB to above 18 dB) derived from simulated propagation through the diffractive layers. In the revised manuscript we will explicitly state in the abstract and introduction that the demonstrations are numerical simulations of the physical optical model. We will add a dedicated subsection on practical hardware considerations, including a first-order error budget based on typical diffractive-optics fabrication tolerances (phase errors ~0.05-0.2 rad, lateral alignment ~5-20 um) and how these would propagate to output PSNR. This revision will clarify the scope of the current claims while preserving the central result on the benefits of the two-step training strategy.

    Revision: yes
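The first-order error budget promised here (phase errors of roughly 0.05-0.2 rad) can at least be bracketed numerically. The sketch below is a crude Monte Carlo stand-in, not the authors' analysis: it perturbs a random single-layer phase profile with Gaussian fabrication error and compares far-field intensities through one FFT:

```python
import numpy as np

rng = np.random.default_rng(3)

def relative_intensity_error(phase_sigma: float, n: int = 64, trials: int = 200) -> float:
    """Mean relative L2 error of the far-field intensity when a designed
    phase profile is perturbed by i.i.d. Gaussian phase errors."""
    errs = []
    for _ in range(trials):
        phi = rng.uniform(0.0, 2.0 * np.pi, (n, n))          # nominal design
        perturbed = phi + rng.normal(0.0, phase_sigma, (n, n))
        intensity_nom = np.abs(np.fft.fft2(np.exp(1j * phi))) ** 2
        intensity_fab = np.abs(np.fft.fft2(np.exp(1j * perturbed))) ** 2
        errs.append(np.linalg.norm(intensity_fab - intensity_nom)
                    / np.linalg.norm(intensity_nom))
    return float(np.mean(errs))

for sigma in (0.05, 0.2):
    print(f"phase error {sigma:.2f} rad -> relative intensity error "
          f"{relative_intensity_error(sigma):.3f}")
```

A real budget would also cover lateral misalignment and unmodeled aberrations; this isolates the phase-error term only.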

Circularity Check

0 steps flagged

No circularity in empirical pre-training and fine-tuning procedure

full rationale

The paper presents an empirical two-step optimization process for a diffractive optical network: pre-training on 3.45 million simple images followed by fine-tuning on domain-specific datasets. There are no mathematical derivations, equations, or load-bearing self-citations that would reduce the reported PSNR gains (from below 8 dB to above 18 dB) to fitted parameters or prior results by construction. Performance claims rest on simulation results that are independent of the training workflow itself; no self-definitional loops, fitted-input predictions, or ansatz smuggling via citation appear in the described method. The approach is therefore self-contained as a standard transfer-learning application to all-optical denoising.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard diffraction physics and the empirical effectiveness of transfer learning when applied to physical optical networks; no new entities are postulated and no free parameters are fitted to the target denoising performance.

axioms (2)
  • standard math: Light propagation through diffractive layers can be modeled by the Fresnel diffraction integral or angular-spectrum method.
    Standard assumption in diffractive optics and optical neural network design.
  • domain assumption: The physical optical network can be accurately optimized in simulation before fabrication.
    Implicit in all diffractive-network papers; required for the pre-training step to be useful.
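The first axiom is concrete enough to write down. Below is a minimal angular-spectrum propagator in numpy; the grid, wavelength, and distance are illustrative choices, not parameters from the paper:

```python
import numpy as np

def angular_spectrum(field: np.ndarray, wavelength: float, dx: float, z: float) -> np.ndarray:
    """Propagate a complex scalar field a distance z via the angular-spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                 # spatial frequencies, cycles/m
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)          # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Example: a plane wave through a square aperture, propagated 5 cm at 632.8 nm.
n, dx = 256, 8e-6                                # 8 um sampling
field = np.zeros((n, n), dtype=complex)
field[96:160, 96:160] = 1.0
out = angular_spectrum(field, 632.8e-9, dx, 0.05)
```

For these parameters no sampled frequency is evanescent, so |H| = 1 everywhere and propagation conserves energy; a diffractive network stacks several such propagations with a trainable phase mask between each.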

pith-pipeline@v0.9.0 · 5551 in / 1455 out tokens · 43598 ms · 2026-05-11T01:50:14.403671+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
