arxiv: 2604.03691 · v1 · submitted 2026-04-04 · 🌌 astro-ph.GA · astro-ph.IM

Recognition: no theorem link

LensAgent: A Self Evolving Agent for Autonomous Physical Inference of Sub-galactic Structure

Jean-Paul Kneib, Philip Torr, Xiaotang Feng, Zihan Wang, Zilang Shu

Authors on Pith no claims yet

Pith reviewed 2026-05-13 17:07 UTC · model grok-4.3

classification 🌌 astro-ph.GA astro-ph.IM

keywords strong gravitational lensingdark matter substructuresLLM agentmass reconstructionautonomous inferencesub-galactic scalesSLACS systemstraining-free framework

0 comments

The pith

An LLM-driven agent autonomously reconstructs mass distributions in strong lensing systems to extract sub-galactic dark matter structures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LensAgent as a training-free framework where a large language model acts as an autonomous scientific agent to infer mass distributions from strong gravitational lensing data. It combines logical reasoning steps with deterministic physical modeling tools to achieve reconstructions on SLACS Grade A systems. This setup targets the scalability limits of manual modeling and the mass-sheet degeneracy that block cosmological use of lensing. A sympathetic reader would care because the method promises to process sub-galactic structures across the volume of upcoming wide-field surveys.

Core claim

LensAgent is a pioneering training-free, large language model driven agentic framework for the autonomous physical inference of mass distributions. Operating as an autonomous scientific agent, it couples high-level logical reasoning with deterministic physical modeling tools, demonstrating successful reconstruction of mass distribution in SLACS Grade A strong lensing systems. This self-evolving architecture enables the robust extraction of sub-galactic substructures at scale.

What carries the argument

The self-evolving agentic framework that couples high-level logical reasoning with deterministic physical modeling tools to perform autonomous mass inference.

If this is right

Mass reconstructions become feasible without manual tuning or neural network training for individual systems.
Sub-galactic substructures can be extracted robustly from strong lensing data at the scale of large surveys.
The mass-sheet degeneracy is mitigated through the agent's iterative physical tool use.
Cosmological constraints on dark matter become accessible from wide-field datasets such as those from LSST and Euclid.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same agent structure could be tested on simulated lensing systems with known ground-truth substructures to quantify error rates.
Extending the tool set to include additional physical priors might further constrain solutions in complex lens configurations.
Deployment on survey-scale catalogs would require validation that the self-evolving loop maintains consistency across thousands of independent systems.

Load-bearing premise

A large language model can couple logical reasoning with physical modeling tools to generate accurate mass reconstructions without producing hallucinations or unphysical solutions.

What would settle it

Direct comparison of LensAgent mass maps for SLACS Grade A systems against independent reconstructions from traditional modeling pipelines that reveals systematic mismatches in substructure properties or total mass would falsify the claim of successful autonomous inference.

read the original abstract

Probing dark matter distribution on sub-galactic scales is essential for testing the Cold Dark Matter ($\Lambda$CDM) paradigm. Strong gravitational lensing, as one of the most powerful approach by far, provides a direct, purely gravitational probe of these substructures. However, extracting cosmological constraints is severely bottlenecked by the mass-sheet degeneracy (MSD) and the unscalable nature of manual and neural-network modeling. Here, we introduce LensAgent, a pioneering training-free, large language model (LLM)-driven agentic framework for the autonomous physical inference of mass distributions. Operating as an autonomous scientific agent, LensAgent couples high-level logical reasoning with deterministic physical modeling tools, demonstarting successful reconstruction of mass distribution in SLACS Grade A strong lensing systems. This self-evolving architecture enables the robust extraction of sub-galactic substructures at scale, unlocking the cosmological potential of upcoming wide-field surveys such as the Rubin Observatory (LSST) and Euclid.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LensAgent is a new LLM-agent idea for automating lensing mass models, but the paper gives no numbers or checks to show it actually works.

read the letter

The main thing to know is that this paper introduces LensAgent, an LLM-driven agent that links high-level reasoning steps with existing deterministic lensing tools in a self-evolving loop, and it claims this produces accurate mass reconstructions for SLACS Grade A systems without any training. That agent setup is the genuinely new piece compared with standard neural-net lensing work. The motivation section also does a clear job laying out why sub-galactic structure constraints matter for Lambda CDM tests and why current manual or supervised methods hit scaling walls with upcoming surveys. The training-free angle is a reasonable attempt to sidestep data-hungry alternatives. The soft spot is straightforward and central: the abstract states successful reconstructions but supplies zero quantitative support. No chi-squared values, no recovered parameters, no comparison against lenstronomy or PyAutoLens outputs, no simulated ground-truth tests, and no explicit handling of the mass-sheet degeneracy. Without those checks it is impossible to tell whether the agent avoids unphysical solutions or simply produces coherent text. The citation list is light on recent agentic-AI applications in physics, but that is secondary. This is aimed at people already running lensing pipelines who want to explore automation for large samples. A reading group could usefully discuss the agent loop itself. I would not cite the work yet because there are no results to rely on. It deserves peer review once the authors add proper validation benchmarks, since the underlying problem is real and the framing is honest even if the evidence is missing.

Referee Report

1 major / 1 minor

Summary. The paper introduces LensAgent, a training-free LLM-driven agentic framework that couples high-level logical reasoning with deterministic physical modeling tools for autonomous inference of mass distributions in strong gravitational lensing systems. It claims successful reconstruction of mass distributions on SLACS Grade A systems, enabling scalable extraction of sub-galactic substructures to support cosmological analyses from surveys such as LSST and Euclid.

Significance. If rigorously validated, the agentic approach could address key scalability bottlenecks in strong-lensing modeling by reducing reliance on manual or supervised neural-network methods. However, the current manuscript provides no quantitative evidence of reconstruction fidelity, limiting assessment of whether the framework genuinely advances the field beyond existing tools.

major comments (1)

[Abstract] Abstract: The central claim of 'successful reconstruction of mass distribution in SLACS Grade A strong lensing systems' is unsupported by any reported metrics (e.g., reduced chi-squared, Einstein-radius recovery fractions, subhalo mass accuracy, or direct comparisons against lenstronomy/PyAutoLens baselines). Without these, the assertion that the self-evolving agent produces accurate, unphysical-artifact-free solutions cannot be evaluated and is load-bearing for the paper's contribution.

minor comments (1)

[Abstract] Abstract: Typographical error 'demonstarting' should read 'demonstrating'.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We agree that the abstract claim requires quantitative support and have made revisions accordingly to address this concern.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim of 'successful reconstruction of mass distribution in SLACS Grade A strong lensing systems' is unsupported by any reported metrics (e.g., reduced chi-squared, Einstein-radius recovery fractions, subhalo mass accuracy, or direct comparisons against lenstronomy/PyAutoLens baselines). Without these, the assertion that the self-evolving agent produces accurate, unphysical-artifact-free solutions cannot be evaluated and is load-bearing for the paper's contribution.

Authors: We agree with the referee that quantitative metrics are essential to substantiate the central claim. The original manuscript presented the LensAgent framework through autonomous case studies on SLACS Grade A systems with visual demonstrations of mass reconstructions, but did not report explicit numerical metrics or baseline comparisons. In the revised manuscript we will add a new quantitative results subsection that includes reduced chi-squared values for the reconstructed models, Einstein-radius recovery fractions, subhalo mass accuracy estimates, and direct side-by-side comparisons against lenstronomy and PyAutoLens on the same systems. These will be presented in tables with accompanying discussion to demonstrate that the agent-derived solutions achieve comparable or superior fidelity while remaining free of unphysical artifacts. This addition will allow readers to rigorously evaluate the framework's performance. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain; framework is tool-coupled without self-referential predictions

full rationale

The paper presents LensAgent as a training-free, LLM-driven agentic framework that couples high-level reasoning with deterministic physical modeling tools to reconstruct mass distributions in SLACS Grade A systems. No mathematical derivation chain, equations, fitted parameters renamed as predictions, or self-citation load-bearing steps appear in the abstract or described content. The central claim rests on empirical demonstration of the agent architecture rather than any first-principles result that reduces to its own inputs by construction. No patterns of self-definitional loops, uniqueness theorems imported from prior author work, or ansatz smuggling are present, making the work self-contained against external benchmarks as a methods contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review is based solely on the abstract; the central claim rests on the unverified assumption that LLM reasoning plus deterministic tools yields physically valid mass models.

axioms (1)

domain assumption Strong gravitational lensing provides a direct, purely gravitational probe of sub-galactic mass distributions
Stated as foundational in the abstract.

pith-pipeline@v0.9.0 · 5479 in / 1264 out tokens · 44297 ms · 2026-05-13T17:07:39.816270+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 3 internal anchors

[1]

Springel, V.et al.Simulating the joint evolution of quasars, galaxies and their large-scale distribution.Nature435, 629–636 (2005)

work page 2005
[2]

Springel, V.et al.The Aquarius Project: the subhalos of galactic halos.Mon. Not. Roy. Astron. Soc.391, 1685–1711 (2008)

work page 2008
[3]

Aghanim, N.et al.Planck 2018 results. VI. Cosmological parameters.Astron. Astrophys.641, A6 (2020). [Erratum: Astron.Astrophys. 652, C4 (2021)]

work page 2018
[4]

Bullock, J. S. & Boylan-Kolchin, M. Small-Scale Challenges to the ΛCDM Paradigm.Ann. Rev. Astron. Astrophys.55, 343–387 (2017)

work page 2017
[5]

& Vegetti, S

Despali, G. & Vegetti, S. The impact of baryonic physics on the subhalo mass function and implications for gravitational lensing.Mon. Not. Roy. Astron. Soc. 469, 1997–2010 (2017)

work page 1997
[6]

J.et al.Is every strong lens model unhappy in its own way? Uniform modelling of a sample of 13 quadruply+ imaged quasars.Mon

Shajib, A. J.et al.Is every strong lens model unhappy in its own way? Uniform modelling of a sample of 13 quadruply+ imaged quasars.Mon. Not. Roy. Astron. Soc.483, 5649–5671 (2019)

work page 2019
[7]

& Sluse, D

Schneider, P. & Sluse, D. Mass-sheet degeneracy, power-law models and exter- nal convergence: Impact on the determination of the Hubble constant from gravitational lensing.Astron. Astrophys.559, A37 (2013)

work page 2013
[8]

J.873, 111 (2019)

Ivezi´ c,ˇZ.et al.LSST: from Science Drivers to Reference Design and Anticipated Data Products.Astrophys. J.873, 111 (2019)

work page 2019
[9]

Scaramella, R.et al.Euclid preparation. I. The Euclid Wide Survey.Astron. Astrophys.662, A112 (2022). 8

work page 2022
[10]

J.et al.dolphin: A Fully Automated Forward-modeling Pipeline Pow- ered by Artificial Intelligence for Galaxy-scale Strong Lenses.Astrophys

Shajib, A. J.et al.dolphin: A Fully Automated Forward-modeling Pipeline Pow- ered by Artificial Intelligence for Galaxy-scale Strong Lenses.Astrophys. J.992, 40 (2025)

work page 2025
[11]

Zhang, G., S ¸eng¨ ul, A. C ¸ . & Dvorkin, C. Subhalo effective density slope mea- surements from HST strong lensing data with neural likelihood-ratio estimation. Mon. Not. Roy. Astron. Soc.527, 4183–4192 (2023)

work page 2023
[12]

GPT-4 Technical Report

OpenAI. GPT-4 technical report.arXiv(2023). URL https://arxiv.org/abs/ 2303.08774

work page internal anchor Pith review Pith/arXiv arXiv 2023
[13]

B.et al.Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M

Brown, T. B.et al.Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. & Lin, H. (eds)Language models are few-shot learners. (eds Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. & Lin, H.)Advances in Neural Information Process- ing Systems 33: Annual Conference on Neural Information Processing Systems 2020, Vol. 33, 1877–1901 (2020). URL https://proc...

work page 2020
[14]

Sadler and Jiaman Wu and Wei

Sur´ ıs, D., Menon, S. & Vondrick, C. Vipergpt: Visual inference via python execution for reasoning.International Conference on Computer Vision, ICCV 11854–11864 (2023). URL https://doi.org/10.1109/ICCV51070.2023.01092

work page doi:10.1109/iccv51070.2023.01092 2023
[15]

URL https://doi.org/10.1038/s44387-025-00019-5

Zhang, Y.et al.Exploring the role of large language models in the scientific method: from hypothesis to discovery.npj Artificial Intelligence1(2025). URL https://doi.org/10.1038/s44387-025-00019-5

work page doi:10.1038/s44387-025-00019-5 2025
[16]

International Conference on Learning Representations (ICLR)(2023)

Yao, S.et al.ReAct: Synergizing reasoning and acting in language models. International Conference on Learning Representations (ICLR)(2023). URL https://openreview.net/forum?id=WE vluYUL-X

work page 2023
[17]

(eds Oh, A.et al.)Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, Vol

Schick, T.et al.Oh, A.et al.(eds)Toolformer: Language models can teach themselves to use tools. (eds Oh, A.et al.)Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, Vol. 36, 68539–68551 (Curran Asso- ciates, 2023). URL https://proceedings.neurips.cc/paper files/paper/2023/file/ d842425e4b...

work page 2023
[18]

& Yao, S

Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. & Yao, S. Oh, A.et al. (eds)Reflexion: language agents with verbal reinforcement learning. (eds Oh, A. et al.)Advances in Neural Information Processing Systems 36: Annual Confer- ence on Neural Information Processing Systems 2023, Vol. 36, 8634–8652 (Curran Associates, 2023). URL https://proceedings.neu...

work page 2023
[19]

(eds Oh, A.et al.)Advances in Neural 9 Information Processing Systems 36: Annual Conference on Neural Infor- mation Processing Systems 2023, Vol

Madaan, A.et al.Oh, A.et al.(eds)Self-refine: Iterative refine- ment with self-feedback. (eds Oh, A.et al.)Advances in Neural 9 Information Processing Systems 36: Annual Conference on Neural Infor- mation Processing Systems 2023, Vol. 36, 46534–46594 (Curran Asso- ciates, 2023). URL https://proceedings.neurips.cc/paper files/paper/2023/file/ 91edff07232fb...

work page 2023
[20]

URL https://doi.org/10

Romera-Paredes, B.et al.Mathematical discoveries from program search with large language models.Nature625, 468–475 (2024). URL https://doi.org/10. 1038/s41586-023-06924-6

work page 2024
[21]

Meyerson, E.et al.Language model crossover: Variation through few-shot prompting.ACM Trans. Evol. Learn. Optim.4, 27:1–27:40 (2024). URL https://doi.org/10.1145/3694791

work page doi:10.1145/3694791 2024
[23]

& Fischer, P

Auer, P., Cesa-Bianchi, N. & Fischer, P. Finite-time analysis of the multiarmed bandit problem.Mach. Learn.47, 235–256 (2002). URL https://doi.org/10.1023/ A:1013689704352

work page 2002
[24]

Koopmans, L. V. E., Treu, T., Bolton, A. S., Burles, S. & Moustakas, L. A. The sloan lens acs survey. 3. the structure and formation of early-type galaxies and their evolution since z˜1.Astrophys. J.649, 599–615 (2006)

work page 2006
[25]

R.et al.The properties of warm dark matter haloes.Mon

Lovell, M. R.et al.The properties of warm dark matter haloes.Mon. Not. Roy. Astron. Soc.439, 300–317 (2014)

work page 2014
[26]

Scalar-Mediated Inelastic Dark Matter as a Solution to Small-Scale Structure Anomalies

Wang, Z. Scalar-Mediated Inelastic Dark Matter as a Solution to Small-Scale Structure Anomalies.arXiv e-printsarXiv:2512.18959 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[27]

Xu, D.et al.How well can cold dark matter substructures account for the observed radio flux-ratio anomalies.Mon. Not. Roy. Astron. Soc.447, 3189–3206 (2015)

work page 2015
[28]

Collett, T. E. The population of galaxy-galaxy strong lenses in forthcoming optical imaging surveys.Astrophys. J.811, 20 (2015)

work page 2015
[29]

& Cirasuolo, M

Padovani, P. & Cirasuolo, M. The Extremely Large Telescope.Contemp. Phys. 64, 47–64 (2023)

work page 2023
[30]

beyond code snippets: Benchmarking llms on repository-level question answering

Bradley, L. astropy/photutils: Version 1.8.0. https://doi.org/10.5281/zenodo. 7946442 (2023). Zenodo

work page doi:10.5281/zenodo 2023
[31]

Virtanen, P.et al.SciPy 1.0: fundamental algorithms for scientific computing in Python.Nature Medicine17, 261–272 (2020)

work page 2020
[32]

E., Hook, R

Krist, J. E., Hook, R. N. & Stoehr, F. Kahan, M. A. (ed.)20 years of Hubble Space Telescope optical modeling using Tiny Tim. (ed.Kahan, M. A.)Optical 10 Modeling and Performance Predictions V, Vol. 8127 ofSociety of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 81270J (2011)

work page 2011
[33]

& Amara, A

Birrer, S. & Amara, A. lenstronomy: Multi-purpose gravitational lens modelling software package.Physics of the Dark Universe22, 189–201 (2018). URL https: //www.sciencedirect.com/science/article/pii/S2212686418301869

work page 2018
[34]

Shajib, A. J. Unified lensing and kinematic analysis for any elliptical mass profile. Mon. Not. Roy. Astron. Soc.488, 1387–1400 (2019)

work page 2019
[35]

Structure and Kinematics of Early-Type Galaxies from Integral Field Spectroscopy.Ann

Cappellari, M. Structure and Kinematics of Early-Type Galaxies from Integral Field Spectroscopy.Ann. Rev. Astron. Astrophys.54, 597–665 (2016)

work page 2016
[36]

Mamon, G. A. & Lokas, E. L. Dark matter in elliptical galaxies - I. Is the total mass density profile of the NFW form or even steeper?Mon. Not. Roy. Astron. Soc.362, 95–109 (2005)

work page 2005
[37]

An analytical model for spherical galaxies and bulges.Astrophysical Journal; (USA)356(1990)

Hernquist, L. An analytical model for spherical galaxies and bulges.Astrophysical Journal; (USA)356(1990). URL https://www.osti.gov/biblio/6696799

work page arXiv 1990
[38]

& Lee, Y

Liu, H., Li, C., Wu, Q. & Lee, Y. J. Oh, A.et al.(eds) Visual instruction tuning. (eds Oh, A.et al.)Advances in Neural Information Processing Systems 36: Annual Conference on Neural Infor- mation Processing Systems 2023, Vol. 36, 34892–34916 (Curran Asso- ciates, 2023). URL https://proceedings.neurips.cc/paper files/paper/2023/file/ 6dcf277ea32ce3288914fa...

work page 2023
[39]

Acero, Z

Yang, Z.et al.MM-REACT: prompting chatgpt for multimodal reasoning and action.arxivabs/2303.11381(2023). URL https://doi.org/10.48550/arXiv. 2303.11381

work page internal anchor Pith review doi:10.48550/arxiv 2023
[40]

Illuminating search spaces by mapping elites

Mouret, J. & Clune, J. Illuminating search spaces by mapping elites.arxiv abs/1504.04909(2015). URL http://arxiv.org/abs/1504.04909

work page Pith review arXiv 2015
[41]

K., Soros, L

Pugh, J. K., Soros, L. B. & Stanley, K. O. Quality diversity: A new frontier for evolutionary computation.Frontiers Robotics AI3, 40 (2016). URL https:// www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2016.00040

work page doi:10.3389/frobt.2016.00040 2016
[42]

& Rockt¨ aschel, T

Fernando, C., Banarse, D., Michalewski, H., Osindero, S. & Rockt¨ aschel, T. Salakhutdinov, R.et al.(eds)Promptbreeder: Self-referential self-improvement via prompt evolution. (eds Salakhutdinov, R.et al.)Forty-first International Conference on Machine Learning, ICML 2024, Proceedings of Machine Learn- ing Research, 13481–13544 (2024). URL https://proceed...

work page 2024
[43]

URL https://openreview.net/forum? 11 id=Bb4VGOWELI

Yang, C.et al.Large language models as optimizers.International Conference on Learning Representations (ICLR)(2024). URL https://openreview.net/forum? 11 id=Bb4VGOWELI

work page 2024
[44]

S.et al.The Sloan Lens ACS Survey

Bolton, A. S.et al.The Sloan Lens ACS Survey. V. The Full ACS Strong-Lens Sample.Astrophys. J.682, 964–984 (2008). Methods Data The lensed sample catalog is selected from SLACS survey.We have selected 20 sam- ples that are marked as grade A, which are confirmed to be strong lensing samples.We then obtain the image of these samples observed by HST,particul...

work page 2008