Enhancing Gravitational Lens Study with Deep Learning: A Study on Effects of Dropout Regularization
Pith reviewed 2026-05-15 15:06 UTC · model grok-4.3
The pith
Dropout in CNNs cuts errors in strong-lens SIE parameter inference by 60-76 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that dropout regularization is critical for enhancing the precision and robustness of estimated SIE parameters from synthetic galaxy-galaxy lens systems, as demonstrated by 4-fold cross-validation: dropout configurations yield R² values up to approximately 0.96 for most parameters, mean Peak Signal-to-Noise Ratios up to approximately 37 dB, and relative errors reduced by 60-76 percent to at most approximately 9 percent at the 90 percent confidence level for the majority of parameters.
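For readers who want the headline metric made concrete, the coefficient of determination can be computed as in the minimal sketch below. The function name `r2_score` and the sample Einstein radii are illustrative assumptions, not code or values from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares between truth and prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares of the truth around its own mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical Einstein radii (arcsec) and predictions, for illustration only.
theta_true = [0.8, 1.0, 1.2, 1.4, 1.6]
theta_pred = [0.82, 0.97, 1.25, 1.38, 1.61]
r2 = r2_score(theta_true, theta_pred)
print(f"R^2 = {r2:.4f}")
```

A value of 1 means the predictions reproduce the true parameters exactly; a value near 0 means the model does no better than always predicting the mean.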
What carries the argument
Modified AlexNet convolutional neural network with three distinct dropout configurations, trained to regress four SIE parameters (Einstein radius, axis ratio, and two ellipticity components) from synthetic strong-lensing images.
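The regularizer at the center of the argument can be illustrated with a small sketch of inverted dropout, the variant used by most modern frameworks. This is a generic illustration, not the authors' implementation: the rate of 0.5, the function name `dropout`, and the toy activations are assumptions, since the summary does not state the probabilities or layer placement used in the three configurations.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout on a flat list of activations.

    During training each unit is zeroed with probability `rate` and the
    survivors are rescaled by 1 / (1 - rate), so the expected activation is
    unchanged. At inference the input passes through untouched.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)      # fixed seed so the sketch is repeatable
features = [1.0] * 10000    # toy activations, illustration only

dropped = dropout(features, rate=0.5, training=True, rng=rng)
mean_train = sum(dropped) / len(dropped)   # close to 1.0 in expectation
unchanged = dropout(features, rate=0.5, training=False, rng=rng)
print(f"mean activation under dropout: {mean_train:.3f}")
```

Because a different random subset of units is silenced on every training step, no single feature detector can be relied on exclusively, which is the mechanism usually credited for the improved generalization.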
If this is right
- Dropout is necessary to achieve high precision and robustness in the inferred SIE parameters.
- Deep learning with dropout enables scalable, computationally efficient modeling of large numbers of strong-lensing systems.
- With dropout, relative errors stay at or below approximately 9 percent at the 90 percent confidence level for most parameters.
- The approach supports high-precision inference of galaxy mass distributions from lensing data.
Where Pith is reading between the lines
- If the synthetic training distribution matches real data, the same architecture could supply rapid initial parameter estimates that speed up full modeling of observed lenses.
- Higher parameter accuracy could tighten statistical constraints on dark-matter halo properties derived from lens samples.
- Applying the model to real survey data would reveal how domain shift between simulations and observations limits performance.
Load-bearing premise
The synthetic images generated from the China Space Station Telescope catalog and the Singular Isothermal Ellipsoid profile faithfully represent the statistical properties of real observed strong-lensing systems.
What would settle it
Running the trained model on a set of real observed strong-lensing systems and comparing the inferred SIE parameters against independent measurements obtained from traditional lens-modeling software would test whether the reported error reductions hold outside the simulated domain.
Original abstract
Strong gravitational lensing provides valuable insights into the mass distribution of galaxies and the nature of dark matter. However, its modeling is computationally demanding due to the large volume of strong lensing observations. In this work, we explore the application of Convolutional Neural Networks to infer physical parameters from simulated galaxy-galaxy lens systems, described by the Singular Isothermal Ellipsoid (SIE) profile for the galaxy lens. We construct a dataset of 76,396 synthetic lensing images derived from the China Space Station Telescope catalog and employ it to train a modified CNN model, based on the AlexNet architecture, to predict four key SIE parameters: Einstein radius, axis ratio, and ellipticity components. We analyze the network performance under three distinct dropout configurations to quantify their influence on generalization and parameter inference accuracy. The results indicate that the incorporation of dropout is critical for enhancing the precision and robustness of the estimated parameters, as demonstrated using a 4-fold cross-validation procedure. When dropout tools are included, we obtain coefficients of determination up to $R^2 \sim 0.96$ for most SIE parameters and mean Peak Signal-to-Noise Ratios of up to $\sim 37$ dB. Relative to the configuration without dropout, the use of dropout reduces the relative errors in the inferred SIE parameters by approximately $60-76\%$, resulting in errors of at most $\sim 9\%$ at the $90\%$ confidence level for the majority of parameters. These findings highlight the potential of deep learning approaches to enable scalable, computationally efficient, and high-precision modeling of strong gravitational lensing systems.
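The 4-fold cross-validation mentioned in the abstract can be sketched as follows. The abstract does not say how the folds were drawn, so the seeded shuffle and the helper name `k_fold_indices` are assumptions; only the dataset size (76,396) and the number of folds (4) come from the source.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]   # k disjoint, near-equal folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Dataset size and fold count as stated in the abstract.
splits = list(k_fold_indices(76396, 4))
for fold_id, (train, validation) in enumerate(splits, start=1):
    print(f"fold {fold_id}: {len(train)} training / {len(validation)} validation images")
```

Each image serves as validation data exactly once, so the reported metrics are always measured on images the network did not see while training on that fold.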
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes the use of a modified AlexNet convolutional neural network to predict four Singular Isothermal Ellipsoid (SIE) lens parameters (Einstein radius, axis ratio, and ellipticity components) from 76,396 simulated galaxy-galaxy strong lensing images generated from the China Space Station Telescope catalog. It evaluates three dropout configurations via 4-fold cross-validation on held-out synthetic data, claiming that dropout yields R² up to ~0.96, mean PSNR up to ~37 dB, and reduces relative parameter errors by 60-76% (to ≤9% at 90% CL) relative to the no-dropout baseline.
Significance. If the results hold under the stated assumptions, the work provides quantitative evidence that dropout regularization improves generalization and accuracy in CNN-based inference of SIE parameters on simulated strong-lensing data. This could support scalable analysis pipelines for large upcoming surveys such as CSST, where traditional modeling is computationally intensive. The reported use of 4-fold CV and standard metrics (R², relative error, PSNR) on independent validation folds strengthens the internal validity of the dropout-benefit claim within the simulation framework.
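The PSNR figure cited in the report follows the standard definition, PSNR = 10 log10(peak² / MSE). The sketch below shows that definition on a toy 2x2 image. How the paper normalizes its images and which peak value it uses are not stated in this summary, so `peak=1.0` and the pixel values are assumptions.

```python
import math


def psnr(reference, reconstruction, peak):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_error = 0.0
    n_pixels = 0
    for row_ref, row_rec in zip(reference, reconstruction):
        for a, b in zip(row_ref, row_rec):
            squared_error += (a - b) ** 2
            n_pixels += 1
    mse = squared_error / n_pixels
    if mse == 0.0:
        return math.inf   # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy images with pixel values in [0, 1], illustration only.
reference = [[0.0, 0.5], [1.0, 0.5]]
reconstruction = [[0.0, 0.51], [0.99, 0.5]]
value = psnr(reference, reconstruction, peak=1.0)
print(f"PSNR = {value:.2f} dB")
```

Because the scale is logarithmic, a gain of 10 dB corresponds to a tenfold drop in mean squared error.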
major comments (3)
- The central performance claims (R² ∼ 0.96, 60-76% error reduction) rest entirely on synthetic images generated under the SIE profile assumption from the CSST catalog; no transfer tests to real observed lenses or to non-SIE profiles (e.g., NFW or composite models) are reported, which is load-bearing for the broader claim of enhancing gravitational lens studies.
- Insufficient detail is provided on the exact modifications to the AlexNet architecture, the specific dropout probabilities and placement in the three configurations, and the full data-generation pipeline (noise model, source properties, and image rendering steps), preventing independent reproduction and verification of the reported gains.
- No baseline comparisons to traditional lens-modeling codes or to other machine-learning approaches are included, making it difficult to quantify the practical advantage of the dropout-enhanced CNN over existing methods.
minor comments (2)
- Clarify the phrase 'dropout tools' in the abstract to 'dropout layers' or 'dropout regularization' for precision.
- Ensure that all reported confidence intervals (e.g., 90% CL) are accompanied by explicit descriptions of how they were computed from the 4-fold CV folds.
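One plausible way to obtain a 90 percent bound from cross-validation output is to pool the per-image relative errors from all folds and take their 90th percentile. This is only an assumed reading, since the manuscript's actual procedure is what the comment above asks the authors to spell out. The helper names and the error values are hypothetical.

```python
import math


def relative_errors(y_true, y_pred):
    """Absolute relative error |pred - true| / |true| for each sample."""
    return [abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)]


def percentile_nearest_rank(values, q):
    """Smallest sample value with at least q percent of the sample at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical relative errors from four validation folds, illustration only.
folds = [[0.01, 0.03, 0.08], [0.02, 0.05], [0.04, 0.09, 0.02], [0.06, 0.07]]
pooled = [err for fold in folds for err in fold]
bound_90 = percentile_nearest_rank(pooled, 90)
print(f"90% of validation images have relative error <= {bound_90:.2f}")
```

Reporting the per-fold percentiles alongside the pooled value would also show how much the bound varies between folds.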
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which have helped us identify areas to strengthen the manuscript. We address each major comment below and outline the revisions we will make.
Point-by-point responses
-
Referee: The central performance claims (R² ∼ 0.96, 60-76% error reduction) rest entirely on synthetic images generated under the SIE profile assumption from the CSST catalog; no transfer tests to real observed lenses or to non-SIE profiles (e.g., NFW or composite models) are reported, which is load-bearing for the broader claim of enhancing gravitational lens studies.
Authors: We agree that the study is restricted to simulated SIE-profile images and does not include transfer tests on real lenses or alternative profiles such as NFW. This is a genuine limitation of the current work, which was designed to isolate the effect of dropout within a controlled simulation framework. In the revised manuscript we will add an explicit Limitations and Future Work subsection that states the scope of the claims, notes the absence of real-data validation, and outlines the need for subsequent studies on observed lenses and more complex mass models.
revision: yes
-
Referee: Insufficient detail is provided on the exact modifications to the AlexNet architecture, the specific dropout probabilities and placement in the three configurations, and the full data-generation pipeline (noise model, source properties, and image rendering steps), preventing independent reproduction and verification of the reported gains.
Authors: We acknowledge that the original manuscript lacks sufficient technical detail for full reproducibility. The revised version will expand the Methods section to specify: (i) the precise architectural modifications to AlexNet, (ii) the dropout probabilities and layer placements for each of the three configurations, and (iii) the complete data-generation pipeline, including the noise model, source galaxy properties, and image rendering steps.
revision: yes
-
Referee: No baseline comparisons to traditional lens-modeling codes or to other machine-learning approaches are included, making it difficult to quantify the practical advantage of the dropout-enhanced CNN over existing methods.
Authors: The manuscript’s primary objective is to quantify the isolated contribution of dropout regularization rather than to perform a comprehensive benchmark. We will nevertheless add a concise discussion section that places our R² and error-reduction figures in the context of previously published results from both traditional lens-modeling codes and other CNN-based approaches, while explicitly noting that a dedicated comparative study lies outside the present scope.
revision: partial
- We cannot supply transfer tests on real observed lenses or non-SIE profiles within the current study, as all experiments were performed exclusively on the simulated SIE dataset described in the manuscript.
Circularity Check
No significant circularity; empirical ML metrics on held-out synthetic data
full rationale
The paper trains a modified AlexNet CNN on 76,396 synthetic SIE lens images generated from the CSST catalog and reports R², PSNR, and relative error metrics obtained via 4-fold cross-validation on held-out folds. These quantities are computed directly from the network's predictions versus ground-truth SIE parameters on validation data; they are not defined in terms of the fitted weights or inputs by construction, nor do they reduce to any self-referential loop. No equations, uniqueness theorems, or ansatzes are invoked that would make the reported performance gains tautological. The only potential concern is the external validity of the synthetic ensemble (noted in the skeptic headline), but that is a question of generalization, not circularity in the derivation chain. The central claims remain independent empirical measurements.
Axiom & Free-Parameter Ledger
free parameters (1)
- dropout probabilities
axioms (1)
- domain assumption: The Singular Isothermal Ellipsoid profile is an adequate description of the mass distribution in the simulated galaxy-galaxy lens systems.
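To make the domain assumption concrete, the sketch below evaluates the standard SIE surface mass density, Σ(ξ1, ξ2) = σ_v² √f / (2G √(ξ1² + f² ξ2²)), with axis ratio 0 < f ≤ 1. The function name, the SI units, and the sample velocity dispersion and radii are illustrative assumptions and are not taken from the paper's simulation pipeline.

```python
import math

G = 6.67430e-11    # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.0857e19    # one kiloparsec in metres


def sie_surface_density(xi1, xi2, sigma_v, f):
    """SIE surface mass density (kg/m^2) at lens-plane position (xi1, xi2) in metres.

    sigma_v is the velocity dispersion in m/s; f is the axis ratio, 0 < f <= 1.
    For f = 1 the expression reduces to the singular isothermal sphere.
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("axis ratio f must satisfy 0 < f <= 1")
    radius = math.sqrt(xi1 ** 2 + f ** 2 * xi2 ** 2)
    return sigma_v ** 2 * math.sqrt(f) / (2.0 * G * radius)


sigma_v = 250e3    # hypothetical velocity dispersion: 250 km/s
on_axis = sie_surface_density(2 * KPC, 0.0, sigma_v, f=0.5)
off_axis = sie_surface_density(0.0, 4 * KPC, sigma_v, f=0.5)
print(f"Sigma on the xi1 axis at 2 kpc: {on_axis:.3e} kg/m^2")
print(f"Sigma on the xi2 axis at 4 kpc: {off_axis:.3e} kg/m^2")
```

The two sample points lie on the same elliptical iso-density contour (ξ1² + f² ξ2² constant), so they return the same density. That constant-contour property is what the axis ratio f encodes.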
Reference graph
Works this paper leans on
-
[1]
Under this prescription, the projected density profile and the corresponding surface mass density $\Sigma(\vec{\xi})$ become $\rho(\vec{r}) = \frac{\sigma_v^2}{2\pi G r^2}$ (6) and $\Sigma(\vec{\xi}) = \frac{\sigma_v^2}{2G} \frac{\sqrt{f}}{\sqrt{\xi_1^2 + f^2 \xi_2^2}}$ (7). In this context, $f$ refers to the axis ratio of the ellipses, specified within the interval $0 < f \le 1$. The SIE model profile is fundamentally defined by the line-of-sight velocity dispersio...
-
[2]
Regarding image reconstruction performance, Figure 7 reports the PSNR distributions, with median values of 36.9 dB, 36.8 dB, and 29.2 dB for models 1, 2, and 3, respectively. When dropout is incorporated into the models, these values place the reconstructed images... [Figure panels omitted: lens images with both axes in pixels (0 to 100); first panel titled "True".]