
arxiv: 2606.09792 · v1 · pith:RZZVL3LHnew · submitted 2026-06-08 · 💻 cs.CV

End-to-End Optimization of Incoherent Imaging for Classification Under Detector-Limited Readout

Pith reviewed 2026-06-27 17:02 UTC · model grok-4.3

classification 💻 cs.CV
keywords incoherent imaging · phase mask · end-to-end optimization · mutual information bound · detector readout · classification · metasurface · spatial frequency

The pith

No incoherent phase mask exceeds the ideal-channel mutual information between detector measurements and class labels; a conventional lens approaches this ceiling and joint optimization yields no gain under full readout.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that for object classification with incoherent imaging, end-to-end optimization of a phase mask and neural network produces no performance improvement over a conventional focusing lens whenever the detector provides full readout. This follows from a proof that no phase mask can surpass the mutual information of the ideal imaging channel to the class labels. Gains from co-design appear only when readout is constrained by coarse spatial sampling or a limited number of measurements, because the optics can then increase class separability in the reduced data. These advantages are largest at low detector noise and when discriminative content lies at lower spatial frequencies than within-class variation. The result clarifies the conditions under which metasurface co-optimization is worthwhile versus standard optics.

Core claim

Under full detector readout, no incoherent phase mask can exceed the ideal-channel mutual information between measurements and class labels; a conventional focusing lens approaches this upper bound, and joint optimization of mask and network yields no empirical gain. When readout is constrained by coarse sampling or few measurements, optimized optics improve classification accuracy by raising class separability in the detector data. These gains shrink with rising detector noise, since the mask shapes the signal before noise is added and cannot remove post-detection noise. The benefit is also largest when class-discriminative spectral content is concentrated at lower spatial frequencies than within-class variation.

What carries the argument

The ideal-channel mutual information bound between detector measurements and class labels under incoherent imaging; it functions as a provable upper limit that no phase mask can surpass, thereby explaining the absence of gains from joint optimization under full readout.

If this is right

  • Under full readout a conventional lens suffices because it approaches the mutual-information ceiling.
  • Optimized phase masks raise class separability only when readout is limited by coarse sampling or few measurements.
  • Gains shrink as detector noise increases because optics act before noise addition.
  • Co-design helps most when class-discriminative content lies at lower spatial frequencies than within-class variation.
  • The same distinctions hold on both synthetic data and standard image benchmarks.
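
The noise dependence listed above can be illustrated with a small numerical sketch. It assumes a two-class Gaussian model with a shared within-class covariance and a Mahalanobis-type separability d^2 = (A Δμ)ᵀ (A Σ Aᵀ + σ² I)⁻¹ (A Δμ), where A is any linear optics-plus-readout operator. This definition, the name `separability_d2`, and the toy numbers are assumptions made for illustration; the paper's own separability proxy may be defined differently.

```python
import numpy as np

def separability_d2(A, delta_mu, Sigma, sigma):
    """Mahalanobis-type d^2 for measurements y = A x + n, with n ~ N(0, sigma^2 I)."""
    m = A @ delta_mu                                      # class-mean difference at the detector
    C = A @ Sigma @ A.T + sigma**2 * np.eye(A.shape[0])   # within-class plus noise covariance
    return float(m @ np.linalg.solve(C, m))

rng = np.random.default_rng(0)
n = 16
delta_mu = rng.normal(size=n)              # toy difference between class means
B = rng.normal(size=(n, n))
Sigma = B @ B.T / n                        # toy within-class covariance
A = np.kron(np.eye(4), np.ones((1, 4)))    # block-sum readout: 16 pixels -> 4 measurements

sigmas = (0.0, 0.5, 1.0, 2.0)
d2 = [separability_d2(A, delta_mu, Sigma, s) for s in sigmas]
print(dict(zip(sigmas, d2)))               # d^2 falls as the detector noise grows
```

Because the noise term adds σ² I to the covariance after A has acted, no choice of A can cancel it, which is the mechanism the third bullet describes.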

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The bound implies that, for full-readout classification, engineering effort should shift from optics to detector design or noise reduction.
  • If the physical system deviates from the assumed incoherent forward model, the mutual-information ceiling may not apply.
  • The framework could be tested on detection or segmentation tasks to check whether readout constraints similarly limit optics gains.
  • The spectral-frequency dependence suggests a simple pre-screening test: measure the power spectra of inter-class versus intra-class differences before deciding on co-design.
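
The pre-screening idea in the last bullet could be prototyped roughly as follows. This is a hedged sketch, not a procedure from the paper: it compares the radially averaged power spectrum of the class-mean difference with that of the within-class residuals and summarizes each by a spectral centroid. The function names (`radial_power_spectrum`, `spectral_centroid`, `prescreen`) and the centroid summary are hypothetical choices.

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged power spectrum of a square 2-D array."""
    n = img.shape[0]
    power = np.abs(np.fft.fftshift(np.fft.fft2(img)))**2
    yy, xx = np.indices(img.shape)
    r = np.hypot(yy - n // 2, xx - n // 2).astype(int)   # integer radius bin per frequency
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return (sums / counts)[: n // 2]

def spectral_centroid(spectrum):
    """Power-weighted mean radial frequency (in bins)."""
    f = np.arange(len(spectrum))
    return float((f * spectrum).sum() / spectrum.sum())

def prescreen(X0, X1):
    """X0, X1: arrays of shape (num_samples, n, n), one per class."""
    between = radial_power_spectrum(X1.mean(0) - X0.mean(0))
    resid = np.concatenate([X0 - X0.mean(0), X1 - X1.mean(0)])
    within = np.mean([radial_power_spectrum(r) for r in resid], axis=0)
    return spectral_centroid(between), spectral_centroid(within)

# Toy data: classes differ by a low-frequency pattern; within-class variation is white noise.
rng = np.random.default_rng(0)
n = 32
_, x = np.indices((n, n))
low = np.cos(2 * np.pi * x / n)
X0 = rng.normal(size=(50, n, n))
X1 = rng.normal(size=(50, n, n)) + low
fb, fw = prescreen(X0, X1)
print(f"between-class centroid {fb:.2f}, within-class centroid {fw:.2f}")
```

In this toy case the between-class centroid sits well below the within-class one, which is the regime where the summary above says co-design helps most.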

Load-bearing premise

The analysis assumes that detector noise is added after the optics and cannot be mitigated by the phase mask, and that the forward model of incoherent imaging accurately represents the physical system.

What would settle it

An experiment in which an optimized phase mask achieves strictly higher mutual information to the labels than the ideal-channel bound under full detector readout, or in which joint optimization produces statistically significant accuracy gains over a lens in the full-readout regime.

Figures

Figures reproduced from arXiv: 2606.09792 by Archer Wang, Joshua Chen, Marin Soljačić, Sachin Vaidya.

Figure 2: Detector divided into k×k blocks, which may correspond to physical pixel regions. Within each block, the pale yellow square denotes a readout region of side length (s); intensities in this region are summed to produce one measurement per block. The top row shows the mask (W) on a single block for different (s). The bottom row shows the resulting detector readout (DW) applied across the full detector. the …

Figure 4: Effect of detector noise on binary masked-readout classification.

Figure 3: Binary classification under detector-limited readout (noiseless). (a) Class means µ0 and µ1 in the synthetic Gaussian experiment. (b) Block-sum readout sweep: detector resolution N′ is varied, with each detector pixel summing a disjoint k × k block. Left: separability proxy d^2. Right: test accuracy. (c) Masked-readout sweep: detector resolution is fixed and the within-block readout size s × s is varied. …

Figure 5: (a) Radially averaged MTF and power-spectrum comparison for …

Figure 6: Frequency-shifted binary synthetic controls under the same masked …

Figure 7: Masked-readout classification on MNIST, FashionMNIST, and SVHN. (a) Test accuracy as a function of within-block readout side length s for MNIST, FashionMNIST, and SVHN. For each dataset, the detector is partitioned into non-overlapping k × k blocks with N′ = 4 (k = 256, N = 1024), and the readout sums a centered s × s region within each block. A fixed conventional focusing lens with a trained neural-networ…

Figure 8: Accuracy versus within-block readout side length …

Figure 9: MNIST masked-readout sweep at detector resolution …
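
The masked readout described in the Figure 2 and Figure 7 captions (one measurement per k × k block, obtained by summing a centered s × s window) can be sketched in a few lines. The function name `masked_block_readout` and the rounding used to center the window when k − s is odd are assumptions; the paper's implementation is not shown here.

```python
import numpy as np

def masked_block_readout(intensity, k, s):
    """Sum a centered s x s window inside each k x k block of a square detector image."""
    n = intensity.shape[0]
    assert intensity.shape == (n, n) and n % k == 0 and 1 <= s <= k
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
    blocks = intensity.reshape(n // k, k, n // k, k)
    lo = (k - s) // 2                                  # window offset inside each block
    window = blocks[:, lo:lo + s, :, lo:lo + s]
    return window.sum(axis=(1, 3))                     # shape (n // k, n // k)

img = np.arange(64, dtype=float).reshape(8, 8)
full = masked_block_readout(img, k=4, s=4)    # s = k: plain block-sum readout
small = masked_block_readout(img, k=4, s=2)   # central 2 x 2 window of each block
print(full)
print(small)
```

With s = k the readout reduces to the block-sum case of Figure 3(b); shrinking s discards light that falls outside the window, which is the constraint the optimized optics are meant to work around.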
Original abstract

End-to-end co-optimization of optical front-ends (e.g. metasurfaces) and neural network back-ends has been widely applied to imaging tasks, yet a formalism characterizing when and why such systems outperform conventional lens-based imaging is largely lacking. This paper focuses on object classification, a central imaging task, and asks when end-to-end optimization of a phase mask for incoherent imaging improves performance over a conventional focusing lens. We find that these gains arise primarily under constrained detector readout and are limited under full detector readout. In the latter setting, we prove that no incoherent phase mask exceeds the ideal-channel mutual information between detector measurements and class labels; a conventional focusing lens approaches this ceiling, and joint optimization yields no empirical gain. When detector readout is constrained -- by coarse spatial sampling or a limited number of measurements -- optimized optics can substantially improve classification by increasing class separability in the detector measurements. These gains are largest under low detector noise and shrink as noise grows, because the optics shape the signal before it reaches the detector but cannot remove noise added afterward. The advantage also depends on the spectral structure of the task: co-design helps most when class-discriminative content is concentrated at lower spatial frequencies than within-class variation. We develop a theoretical framework formalizing these distinctions and test its predictions on synthetic data and standard benchmarks (MNIST, FashionMNIST, SVHN).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that end-to-end co-optimization of an incoherent phase mask and neural network backend for object classification yields no benefit under full detector readout, because no phase mask can exceed the mutual information of the ideal channel (which a conventional lens approaches); under constrained readout (coarse sampling or limited measurements), optimized optics improve class separability, with gains largest at low noise and when class-discriminative content is at lower spatial frequencies than within-class variation. A theoretical framework is developed and tested on synthetic data plus MNIST, FashionMNIST, and SVHN.

Significance. If the central claims hold, the work supplies a clear formalism distinguishing when optics-computation co-design is useful versus redundant for classification, with the mutual-information upper bound and the spectral-structure condition as notable contributions. The explicit dependence on readout constraints and post-optics noise is a useful practical takeaway, and the use of public benchmarks aids reproducibility.

major comments (2)
  1. [Theory section deriving the MI bound] The MI bound (abstract and theory section) is derived under the model where the phase mask shapes intensity via the incoherent PSF before additive detector noise is applied. This premise is load-bearing for the claim that 'no incoherent phase mask exceeds the ideal-channel mutual information'; if physical noise (e.g., Poisson) occurs on the intensity before or during propagation, or if the forward model omits non-shift-invariant effects, the inequality may not hold and a phase mask could still improve separability.
  2. [Empirical evaluation under full readout] The statement that 'a conventional focusing lens approaches this ceiling' (abstract) requires quantitative support: the manuscript should report the numerical gap between the lens MI and the ideal-channel bound on the same datasets used for the empirical tests.
minor comments (2)
  1. [Abstract] The abstract lists benchmarks but omits class counts, image resolutions, and any preprocessing; these details should be stated explicitly for reproducibility.
  2. [Theory section] Notation for the incoherent PSF and the ideal channel should be introduced with a single equation reference rather than scattered across paragraphs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify the scope of our theoretical claims and strengthen the empirical presentation. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Theory section deriving the MI bound] The MI bound (abstract and theory section) is derived under the model where the phase mask shapes intensity via the incoherent PSF before additive detector noise is applied. This premise is load-bearing for the claim that 'no incoherent phase mask exceeds the ideal-channel mutual information'; if physical noise (e.g., Poisson) occurs on the intensity before or during propagation, or if the forward model omits non-shift-invariant effects, the inequality may not hold and a phase mask could still improve separability.

    Authors: Our analysis is developed under the standard model of incoherent imaging, in which the phase mask acts on the object intensity through the point spread function (PSF), followed by additive post-detection noise. This models common detector readout noise and is the setting in which the mutual-information upper bound holds. We will revise the manuscript to state this modeling assumption explicitly in the theory section and to discuss its implications, including that pre-propagation Poisson noise or non-shift-invariant aberrations would require a separate analysis. The bound and the conclusion that no phase mask exceeds the ideal channel are therefore scoped to the stated forward model. revision: partial

  2. Referee: [Empirical evaluation under full readout] The statement that 'a conventional focusing lens approaches this ceiling' (abstract) requires quantitative support: the manuscript should report the numerical gap between the lens MI and the ideal-channel bound on the same datasets used for the empirical tests.

    Authors: We agree that reporting the numerical gap will make the claim more precise. In the revised manuscript we will add a table (or figure panel) showing the estimated mutual information achieved by the conventional lens versus the ideal-channel bound for MNIST, FashionMNIST, and SVHN under full readout, using the same estimation procedure employed elsewhere in the paper. revision: yes
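
One inexpensive way to put numbers next to such a table, offered here as an illustration and not as the estimation procedure the authors refer to, is Fano's inequality: the test error of any classifier operating on the detector measurements yields a lower bound on the mutual information between measurements and labels. The helper name `fano_mi_lower_bound` and the example error rates are hypothetical.

```python
import numpy as np

def fano_mi_lower_bound(error_rate, num_classes, class_entropy_bits=None):
    """Lower bound in bits on I(C; Y) from the error rate of any classifier acting on Y.

    Fano: H(C | Y) <= H_b(Pe) + Pe * log2(K - 1), so I(C; Y) >= H(C) - H_b(Pe) - Pe * log2(K - 1).
    A uniform class prior, H(C) = log2(K), is assumed unless class_entropy_bits is given.
    """
    K, pe = num_classes, float(error_rate)
    h_c = np.log2(K) if class_entropy_bits is None else class_entropy_bits
    h_b = 0.0 if pe in (0.0, 1.0) else -pe * np.log2(pe) - (1 - pe) * np.log2(1 - pe)
    return max(0.0, h_c - h_b - pe * np.log2(K - 1))

# Hypothetical error rates for a 10-class task, for illustration only.
for name, err in [("lens", 0.02), ("optimized mask", 0.02), ("coarse readout", 0.30)]:
    print(name, round(fano_mi_lower_bound(err, 10), 3), "bits")
```

The bound is loose and only one-sided, so it can show that a configuration retains at least a certain amount of label information but cannot by itself confirm that a lens reaches the ideal-channel ceiling.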

Circularity Check

0 steps flagged

No circularity: MI bound follows from standard information theory on stated model

Full rationale

The paper's central proof states that no incoherent phase mask can exceed the ideal-channel mutual information I(detector measurements; labels) under the explicit model of incoherent PSF shaping followed by additive detector noise. This follows directly from the data-processing inequality and the fact that the phase mask cannot alter post-optics noise statistics; the derivation uses textbook information-theoretic arguments rather than any fitted parameter, self-citation chain, or ansatz imported from prior author work. No equation reduces to a tautology or renames a fitted quantity as a prediction. Empirical sections rely on public datasets (MNIST, FashionMNIST, SVHN) without self-referential fitting loops. The assumption that noise is strictly post-optics is a modeling premise, not a circularity in the derivation itself.
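
For readers who want to see how a data-processing argument of this kind can go, the following is one standard route, written under assumptions that are not taken from the paper: Gaussian detector noise, and a linear optics operator H whose spectral norm is at most one (plausible for a unit-sum, nonnegative, shift-invariant incoherent PSF, whose transfer function has magnitude at most one). The symbols c (label), x (object intensity), n and n' (noise terms) are introduced here for the sketch; the paper's actual proof may differ.

```latex
% Sketch only. Assumptions: n ~ N(0, \sigma^2 I) and \|H\|_2 \le 1, so that I - HH^T is positive semidefinite.
\begin{align*}
y_{\mathrm{ideal}} &= x + n, \qquad y_H = Hx + n, \qquad n \sim \mathcal{N}(0, \sigma^2 I), \\
\tilde{y} &= H\, y_{\mathrm{ideal}} + n', \qquad n' \sim \mathcal{N}\!\bigl(0,\ \sigma^2 (I - HH^{\top})\bigr)\ \text{independent of } (c, x, n), \\
\tilde{y} &= Hx + (Hn + n'), \qquad \operatorname{Cov}(Hn + n') = \sigma^2 HH^{\top} + \sigma^2 (I - HH^{\top}) = \sigma^2 I, \\
&\Rightarrow\ \tilde{y} \mid x \ \overset{d}{=}\ y_H \mid x, \\
c \to x \to y_{\mathrm{ideal}} \to \tilde{y} \quad &\Rightarrow \quad I(c;\, y_H) = I(c;\, \tilde{y}) \le I(c;\, y_{\mathrm{ideal}}).
\end{align*}
```

In words: under these assumptions the masked measurement is statistically a degraded copy of the ideal-channel measurement, so the data-processing inequality caps its information about the label.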

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard definitions from information theory and linear imaging models; no new free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (2)
  • [standard math] Mutual information is defined in the standard way between random variables representing class labels and detector measurements.
    Used to establish the performance ceiling under full readout.
  • [domain assumption] Incoherent imaging is modeled as a linear intensity mapping followed by additive detector noise.
    Underpins the claim that optics cannot remove post-detection noise.
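
The domain assumption above can be made concrete with a minimal forward-model sketch. It assumes a single-FFT (Fraunhofer) relation between pupil and PSF, circular convolution, and Gaussian noise; these choices and the names `incoherent_psf` and `detect` are illustrative assumptions, and the paper may use a different propagation or noise model.

```python
import numpy as np

def incoherent_psf(phase, aperture):
    """Intensity PSF of a pupil with the given phase profile, normalized to unit sum."""
    pupil = aperture * np.exp(1j * phase)
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))
    psf = np.abs(field)**2                 # incoherent imaging is linear in intensity
    return psf / psf.sum()

def detect(obj, psf, sigma, rng):
    """Circular convolution of object intensity with the PSF, then additive Gaussian noise."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    clean = np.real(np.fft.ifft2(np.fft.fft2(obj) * otf))
    return clean + sigma * rng.normal(size=obj.shape)   # noise enters after the optics

rng = np.random.default_rng(0)
n = 64
yy, xx = np.indices((n, n)) - n // 2
aperture = (np.hypot(yy, xx) <= n // 8).astype(float)

psf_lens = incoherent_psf(np.zeros((n, n)), aperture)            # flat phase: focused spot
psf_mask = incoherent_psf(rng.uniform(0, 2 * np.pi, (n, n)), aperture)  # arbitrary phase mask

obj = rng.random((n, n))
y_clean = detect(obj, psf_mask, 0.0, rng)
y_noisy = detect(obj, psf_mask, 0.1, rng)
print(obj.sum(), y_clean.sum())            # the unit-sum PSF conserves total intensity
```

The phase profile only reshapes `psf`; the noise term is added afterwards and is untouched by it, which is the structural point the axiom relies on.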

