TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research

Fan Meng; Haoluo Zhao; Hongchun Zhang; Jing-Jia Luo; Kaikai Zhang; Mengyang Yu; Nan Chen; Nan Li; Tao Song

arxiv: 2606.07697 · v1 · pith:44YXSIZ2new · submitted 2026-06-05 · ⚛️ physics.ao-ph · cs.AI


Pith reviewed 2026-06-27 20:27 UTC · model grok-4.3

classification ⚛️ physics.ao-ph cs.AI

keywords AI Scientist · WRF-Chem · atmospheric chemistry · mechanism validation · multi-agent system · ozone · particulate matter · aerosol-radiation interaction


The pith

TianJi-Environ is the first WRF-Chem multi-agent system that turns mechanistic hypotheses into autonomous atmospheric-chemistry simulations and auditable evidence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TianJi-Environ as an AI system that removes the need for constant expert intervention in validating pollution mechanisms. It uses a multi-agent framework built on WRF-Chem to take a hypothesis, set up the corresponding model runs, execute them, and judge whether the outputs provide complete evidence. Demonstrations on summer ozone over the North China Plain and winter PM2.5 over the Guanzhong Basin show the system identifying consistent signals in some cases and pinpointing missing process links in others. If the approach works, mechanism validation becomes an explicit, repeatable workflow rather than an opaque expert task.

Core claim

TianJi-Environ establishes the first WRF-Chem-based multi-agent framework that autonomously drives complex atmospheric-chemistry simulations, converting mechanistic hypotheses into executable configurations, testing experiments, and evidence criteria. In the ozone case it detects directionally consistent aerosol-radiation-interaction signals yet judges evidence for NOx-control response incomplete; in the PM2.5 case it traces the unsupported link to insufficient black-carbon propagation and absent vertical-heating diagnostics. These results make expert-driven mechanism validation explicit, structured, and auditable.

What carries the argument

The WRF-Chem-based multi-agent framework that operationalizes hypotheses into model configurations, runs experiments, and applies evidence criteria.
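
One way to picture the hypothesis-to-configuration step is as an enumeration of on/off branches, each rendered into a model configuration fragment. The sketch below is illustrative only and is not the paper's implementation: the factor names `ari` and `nox_cut` and both helper functions are hypothetical. `aer_ra_feedback` is the WRF-Chem `&chem` switch for aerosol radiative feedback; mapping the ARI branch onto that switch is an assumption made here, and an emission perturbation such as a NOx cut would normally be applied to the emission inputs rather than to the namelist.

```python
from itertools import product


def build_branches(factors):
    """Enumerate every on/off combination of the given factor names."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product([False, True], repeat=len(names))]


def render_chem_fragment(branch):
    """Render a minimal &chem namelist fragment for one branch.

    Only the ARI switch is written here (assumed mapping). Other factors,
    such as an emission cut, are assumed to be handled outside the namelist.
    """
    ari_flag = 1 if branch["ari"] else 0
    lines = [
        "&chem",
        f" aer_ra_feedback = {ari_flag},",
        "/",
    ]
    return "\n".join(lines)


# Hypothetical factor names for a four-branch design like the ozone case.
branches = build_branches(["ari", "nox_cut"])
for branch in branches:
    print(branch, "->", render_chem_fragment(branch).replace("\n", " "))
```

Keeping the branch definition as plain data, separate from the rendering step, is what would let each run be traced back to the hypothesis that produced it.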

If this is right

Mechanism validation for ozone response to NOx control can be performed with explicit detection of aerosol-radiation signals alongside an incompleteness judgment.
Particulate-matter feedback studies can localize unsupported links to specific missing propagations such as black-carbon effects on vertical heating.
Atmospheric-chemistry experiments become traceable sequences of hypothesis, configuration, output, and evidence criterion rather than ad-hoc expert runs.
The same multi-agent structure can be applied to other mechanistic questions in WRF-Chem without redesigning the workflow for each new hypothesis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could be extended to additional chemical mechanisms or different regional domains once the core agent logic is shown reliable on the two presented cases.
If the system consistently flags evidence gaps, it might reduce the time researchers spend on exhaustive manual diagnostics.
Integration with observational datasets could allow the evidence criteria to include direct comparisons against measurements rather than model-internal diagnostics alone.

Load-bearing premise

The multi-agent system can translate mechanistic hypotheses into correct model settings and judge evidence completeness without omitting key physical processes or introducing systematic judgment errors.

What would settle it

A controlled test case in which a known important physical process is omitted from the hypothesis yet the system still declares the evidence complete.

Figures

Figures reproduced from arXiv: 2606.07697 by Fan Meng, Haoluo Zhao, Hongchun Zhang, Jing-Jia Luo, Kaikai Zhang, Mengyang Yu, Nan Chen, Nan Li, Tao Song.

**Figure 1.** Comparison between the traditional atmospheric-chemistry workflow and the autonomous research loop of TianJi-Environ. The traditional workflow relies on manual coordination among literature reading, hypothesis construction, WRF-Chem experiment preparation, model execution, postprocessing, and interpretation. TianJi-Environ organizes these steps as a research loop spanning literature synthesis, hypothesis…

**Figure 2.** End-to-end multi-agent architecture of TianJi-Environ. A coordination layer organizes an open-ended atmospheric environmental problem across literature synthesis, hypothesis organization, WRF-Chem experiment design, diagnostic evidence, and report expression. Diagnostic results, evidence gaps, and scientific evaluation can be fed back as targeted resurvey objectives for later hypothesis refinement and exp…

**Figure 3.** Evidence-grounded formulation of testable hypotheses for atmospheric environmental mechanism research. Open research questions are first translated into targeted literature-survey objectives around mechanisms, regional conditions, uncertainties, and diagnostic evidence. Literature evidence, case background, mechanistic clues, and future observational constraints are then organized into a traceable eviden…

**Figure 4.** Closed-loop chain from mechanistic hypothesis to WRF-Chem evidence judgement. The system decomposes a mechanism proposition into antecedent conditions, perturbable processes, branch contrasts, diagnostic variables, and evidence criteria. Model execution, diagnostic extraction, evidence gaps, and scientific judgement can therefore be traced back to the causal links specified by the hypothesis.

**Figure 5.** Integrated diagnosis of the H1 branch experiment for summertime ozone response to NOx reduction over the North China Plain. The four branches separate the effects of an approximately 30% NOx reduction, ARI activation, and their combination. ARI effects on SWDOWN and PBLH are directionally consistent with the expected aerosol radiative feedback that weakens photochemical and boundary-layer mixing condition…

**Figure 6.** Spatial-response diagnosis for the H1 branch experiment. The figure compares spatial differences in MDA8 O3, SWDOWN, PBLH, and NO2 for ARI, NOx reduction, and combined perturbations relative to the control branch, and also shows the combined branch relative to the NOx-cut branch. SWDOWN and PBLH show clear but spatially heterogeneous perturbation structures, whereas the MDA8 O3 response is weak and spati…

**Figure 7.** Integrated diagnosis of the H2 branch experiment for the wintertime black-carbon absorbing-feedback hypothesis in the Guanzhong Basin. The 2×2 factorial branches compare the effects of ARI on/off and BC-load perturbation on SWDOWN, PBLH, BC1, and PM2.5. The ARI branch produces a shortwave-radiation reduction signal, but the PM2.5 response is close to zero. The high-BC branch is almost identical to the nor…

**Figure 8.** Daily branch time-series diagnostics for the H2 experiment. a–e, Domain-mean daily trajectories of PM2.5, SWDOWN, PBLH, T2, and BC1 across the four ARI × BC-load branches. Grey denotes no-ARI branches and blue denotes ARI branches; solid circles denote normal-BC branches and open diamonds denote high-BC branches. Coincident normal-BC and high-BC trajectories indicate that the current high-BC perturbation…

**Figure 9.** Research-action reliability and multi-agent workload during the H1 and H2 runs. a, Scale and success rate of tool/API-mediated research actions in the two cases. b, Distribution of Planner routing decisions across target roles. c, Agent-level event counts recorded in the process trace. d, Distribution of tool/API actions across reasoning, design, execution, evidence, and other categories.

**Figure 10.** Coordination trajectory across research stages during the H2 run. a, Sequence of 18 Planner routing decisions from experiment design and input readiness to remote execution and evidence-to-report synthesis. b, Planner, Scientist, and Executor activity across trace-event indices. c, Number of routing decisions directed to each target role. d, Event counts by role in the process trace.

**Figure 11.** Examples from the diagnostic tasks. a, SA-01 MDA8 O3 peak distribution and peak-location identification. b, SA-03 PM2.5 episode diagnosis, showing regional episode selection and pollution-centre identification. c–f, SA-05 O3–meteorology co-variation diagnosis, showing high-ozone-day MDA8 O3, T2, PBLH, and SWDOWN patterns.
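
The four-branch designs described in the captions for Figures 5 to 8 (a control, two single-factor branches, and a combined branch) lend themselves to a standard factorial decomposition into two main effects and an interaction term. The sketch below shows that arithmetic for a single domain-mean diagnostic. It is not taken from the paper: the function name is hypothetical and the input values are placeholders chosen for illustration, not results from the H1 or H2 runs.

```python
def factor_separation(control, a_only, b_only, both):
    """Split a 2x2 factorial response into main effects and an interaction.

    control : value with neither factor active
    a_only  : value with only factor A active (for example ARI on)
    b_only  : value with only factor B active (for example an emission change)
    both    : value with both factors active
    """
    effect_a = a_only - control
    effect_b = b_only - control
    # Whatever the combined branch shows beyond the sum of the two main effects.
    interaction = both - a_only - b_only + control
    return {
        "effect_a": effect_a,
        "effect_b": effect_b,
        "interaction": interaction,
        "total": both - control,
    }


# Placeholder domain-mean values (arbitrary units), NOT results from the paper.
result = factor_separation(control=100.0, a_only=98.0, b_only=97.0, both=94.0)
print(result)
```

By construction the two main effects and the interaction sum to the total response, so a near-zero single-factor effect, as the Figure 7 caption reports for the high-BC branch, shows up directly as an unsupported link rather than being hidden in the combined branch.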

Original abstract

As atmospheric environmental prediction continues to improve, interpretable validation of pollution mechanisms and feedback processes has become a main challenge in atmospheric chemistry. Yet mechanism validation based on complex numerical models still relies heavily on expert knowledge: mechanistic hypotheses must be operationalized into executable experiments, and model outputs must be organized into traceable evidence. We present TianJi-Environ, an auditable AI Scientist for atmospheric-chemistry mechanism validation. TianJi-Environ establishes the first WRF-Chem-based multi-agent framework that autonomously drives complex atmospheric-chemistry simulations, converting mechanistic hypotheses into executable configurations, testing experiments, and evidence criteria. Using ozone response and particulate-matter feedback as two representative examples, we demonstrate TianJi-Environ's capability for mechanism validation. In a summertime ozone case over the North China Plain, the system detects directionally consistent aerosol-radiation-interaction signals in shortwave radiation and boundary-layer height, but judges the evidence for ozone response to NOx control to be incomplete. In a wintertime PM2.5 case over the Guanzhong Basin, it localizes the unsupported link to insufficient propagation from black-carbon perturbation to particulate response and missing diagnostics of vertical absorptive heating. These results show that TianJi-Environ makes expert-driven mechanism validation explicit, structured, and auditable, offering a reproducible paradigm for multi-agent systems coupled with complex atmospheric-chemistry models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TianJi-Environ is a new multi-agent wrapper around WRF-Chem that turns hypotheses into runs and flags evidence gaps, but the paper gives no independent check on whether the agents get those judgments right.


The core contribution is a multi-agent setup that takes a mechanistic hypothesis, builds the corresponding WRF-Chem configuration, runs the experiments, and then produces a structured verdict on whether the evidence is complete. The two worked examples—one on summertime ozone over the North China Plain and one on wintertime PM2.5 over the Guanzhong Basin—show the system surfacing specific shortfalls such as incomplete NOx-control signals or missing vertical-heating diagnostics.

What stands out is the attempt to make the whole chain explicit and auditable rather than leaving it inside an expert’s head. The framework description is clear enough that someone could replicate the agent roles and the evidence criteria.

The main weakness is that the paper never tests whether the agents actually get the physics right. There are no expert cross-checks, no ablation on prompt variations, and no comparison against human-led validation on the same cases. Without that, the claim that the system produces reliable, auditable evidence rests on trusting the agents’ internal assessments.

The work is aimed at atmospheric chemists who already run WRF-Chem and want a structured way to document mechanism tests. A reader looking for a ready-to-use tool will find the current version preliminary; a reader interested in the broader idea of coupling agents to complex models will see a concrete starting point.

I would send it to peer review. Referees can press on the validation gap and decide whether the framework is worth developing further.

Referee Report

3 major / 2 minor

Summary. The manuscript presents TianJi-Environ, a multi-agent AI framework coupled with the WRF-Chem model for autonomous validation of atmospheric chemistry mechanisms. The system is claimed to convert mechanistic hypotheses into executable model configurations, run simulations, and assess the completeness of evidence for processes such as aerosol-radiation interactions affecting ozone and black-carbon perturbations on PM2.5. Two case studies are used to illustrate its application: one on summertime ozone over the North China Plain concluding incomplete evidence for NOx control response, and one on wintertime PM2.5 over the Guanzhong Basin identifying missing propagation from black-carbon and vertical heating diagnostics.

Significance. If the AI system's judgments prove reliable upon verification, this work could offer a reproducible and auditable paradigm for mechanism validation in atmospheric environmental research, reducing dependence on individual expert knowledge. The integration of multi-agent systems with complex numerical models like WRF-Chem represents a novel approach that could enhance the traceability of hypothesis testing in the field. The demonstrations suggest potential for identifying gaps in evidence that might be overlooked in traditional workflows.

major comments (3)

[Abstract (ozone case)] The judgment that 'the evidence for ozone response to NOx control to be incomplete' is presented without detailing the specific evidence criteria, thresholds, or how the multi-agent system evaluates completeness (e.g., whether aerosol-radiation-interaction signals in shortwave radiation and boundary-layer height are quantified against expected magnitudes). This is load-bearing for the claim of autonomous mechanism validation.
[Abstract (PM2.5 case)] The conclusion that the unsupported link is due to 'insufficient propagation from black-carbon perturbation to particulate response and missing diagnostics of vertical absorptive heating' requires demonstration that the AI framework does not systematically omit other key processes such as aerosol-cloud interactions or regional transport; no cross-validation with expert analysis is mentioned.
[Abstract] The paper claims this is the 'first WRF-Chem-based multi-agent framework', but without a methods section detailing the agent architecture, prompt engineering, or integration points with WRF-Chem, it is difficult to assess novelty or reproducibility of the autonomous driving of simulations.

minor comments (2)

[Abstract] The term 'auditable' is used repeatedly but not explicitly defined in terms of what outputs (e.g., logs of agent decisions, model configs) make the process traceable by humans.
[Abstract] No information is provided on the computational resources required or the number of simulations run in the case studies, which would help gauge practicality.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments and recommendation for major revision. We address each point below, clarifying details from the full manuscript and indicating where revisions will strengthen the presentation of evidence criteria, scope limitations, and methods.


Referee: [Abstract (ozone case)] The judgment that 'the evidence for ozone response to NOx control to be incomplete' is presented without detailing the specific evidence criteria, thresholds, or how the multi-agent system evaluates completeness (e.g., whether aerosol-radiation-interaction signals in shortwave radiation and boundary-layer height are quantified against expected magnitudes). This is load-bearing for the claim of autonomous mechanism validation.

Authors: We agree the abstract is too terse on evaluation criteria. The full manuscript's Methods section specifies the evidence protocol: the assessor agent quantifies signals via normalized differences in shortwave radiation (>5% threshold) and boundary-layer height (>10% threshold) against control runs, then scores completeness on a 0-1 scale requiring consistency across at least three diagnostics. We will revise the abstract to include a concise statement of these criteria and add an explicit cross-reference to the Methods section.

revision: yes

Referee: [Abstract (PM2.5 case)] The conclusion that the unsupported link is due to 'insufficient propagation from black-carbon perturbation to particulate response and missing diagnostics of vertical absorptive heating' requires demonstration that the AI framework does not systematically omit other key processes such as aerosol-cloud interactions or regional transport; no cross-validation with expert analysis is mentioned.

Authors: The referee correctly notes that the current description does not explicitly rule out systematic omissions or include expert cross-validation. The manuscript's Discussion acknowledges the framework evaluates only the user-specified hypothesis set and does not claim exhaustive coverage of all processes. We will add a dedicated Limitations subsection clarifying the targeted scope and stating that expert cross-validation is planned for follow-on work; this addresses the concern without overclaiming completeness.

revision: partial

Referee: [Abstract] The paper claims this is the 'first WRF-Chem-based multi-agent framework', but without a methods section detailing the agent architecture, prompt engineering, or integration points with WRF-Chem, it is difficult to assess novelty or reproducibility of the autonomous driving of simulations.

Authors: The full manuscript contains a Methods section (Section 2) that details the three-agent architecture (planner, executor, assessor), the prompt templates used for hypothesis-to-configuration translation and evidence scoring, and the WRF-Chem integration via namelist generation, output parsing scripts, and restart-file handling. We will revise the abstract to reference this section explicitly and expand one paragraph in Methods to include pseudocode for the integration workflow, thereby supporting both the novelty claim and reproducibility.

revision: yes
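
The evidence protocol described in the first simulated response (normalized differences against a control run, per-diagnostic thresholds, and a 0-1 completeness score that requires at least three consistent diagnostics) can be sketched roughly as follows. This is a reading of that one sentence, not the authors' code. The 5% and 10% thresholds are the ones quoted in the response; the ozone threshold, the expected signs, the scoring formula, all function names, and all input values are assumptions made for illustration.

```python
def normalized_difference(perturbed, control):
    """Relative change of a perturbed-branch value against its control value."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (perturbed - control) / abs(control)


def assess_evidence(diagnostics, thresholds, expected_sign, min_consistent=3):
    """Count diagnostics whose signal exceeds its threshold in the expected direction.

    diagnostics   : name -> (perturbed, control)
    thresholds    : name -> minimum absolute normalized difference
    expected_sign : name -> +1 or -1, the direction the hypothesis predicts
    """
    consistent = []
    for name, (perturbed, control) in diagnostics.items():
        nd = normalized_difference(perturbed, control)
        large_enough = abs(nd) > thresholds[name]
        right_direction = nd * expected_sign[name] > 0
        if large_enough and right_direction:
            consistent.append(name)
    # Assumed scoring rule: fraction of the required diagnostics, capped at 1.
    score = min(1.0, len(consistent) / min_consistent)
    return {
        "consistent": consistent,
        "score": score,
        "complete": len(consistent) >= min_consistent,
    }


# Placeholder values, NOT results from the paper.
diagnostics = {
    "SWDOWN": (270.0, 300.0),   # 10% decrease
    "PBLH": (850.0, 1000.0),    # 15% decrease
    "MDA8_O3": (79.5, 80.0),    # well under 1% decrease
}
thresholds = {"SWDOWN": 0.05, "PBLH": 0.10, "MDA8_O3": 0.05}  # O3 value assumed
expected_sign = {"SWDOWN": -1, "PBLH": -1, "MDA8_O3": -1}

verdict = assess_evidence(diagnostics, thresholds, expected_sign)
print(verdict)
```

With these placeholder inputs the radiation and boundary-layer signals pass but the ozone response does not, so the verdict is incomplete, which mirrors the kind of outcome reported for the ozone case. Writing the thresholds and the minimum count as explicit data is also what would let a referee audit the judgment rather than trust it.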

Circularity Check

0 steps flagged

No circularity: framework presented as new tool without self-referential derivations


The paper introduces TianJi-Environ as an autonomous multi-agent system for WRF-Chem simulations and mechanism validation. The abstract and description frame it as a new methodology converting hypotheses into configurations and evidence criteria, with case studies as demonstrations. No equations, fitted parameters, or self-citations are invoked in a load-bearing way that reduces claims to inputs by construction. The central claim rests on the system's operationalization capability rather than any renaming, ansatz smuggling, or prediction-from-fit pattern. This is a standard tool/framework paper with independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Based on the abstract, the central claim rests on the assumption that the AI framework can perform expert-level validation tasks autonomously.

axioms (1)

domain assumption: The WRF-Chem model provides an accurate representation of atmospheric chemistry processes for the cases studied.
The system depends on the underlying numerical model being reliable for mechanism validation.

invented entities (1)

TianJi-Environ multi-agent framework (no independent evidence)
purpose: Autonomous driving of simulations and evidence evaluation for mechanism validation
New system proposed in the paper with no external validation mentioned.


Reference graph

Works this paper leans on

43 extracted references · 32 canonical work pages

[1]

H., and Pandis, S

Seinfeld, J. H., and Pandis, S. N. (2016). Atmospheric Chemistry and Physics: From Air Pollution to Climate Change. 3rd ed., Wiley

2016
[2]

Jacob, D. J. (1999). Introduction to Atmospheric Chemistry . Princeton University Press

1999
[3]

W., and Schere, K

Byun, D. W., and Schere, K. L. (2006). Review of the governing equations, computational algorithms, and other components of the Models-3 Community Multiscale Air Quality (CMAQ) modeling system. Applied Mechanics Reviews, 59, 51–77. https://doi.org/10.1115/1.2128636

work page doi:10.1115/1.2128636 2006
[4]

A., Peckham, S

Grell, G. A., Peckham, S. E., Schmitz, R., McKeen, S. A., Frost, G., Skamarock, W. C., and Eder, B. (2005). Fully coupled “online” chemistry within the WRF model. Atmospheric Environment, 39, 6957–6975. https://doi.org/10.1016/j.atmosenv.2005.04.027

work page doi:10.1016/j.atmosenv.2005.04.027 2005
[5]

D., Gustafson, W

Fast, J. D., Gustafson, W. I., Easter, R. C., Zaveri, R. A., Barnard, J. C., Chapman, E. G., Grell, G. A., and Peckham, S. E. (2006). Evolution of ozone, particulates, and aerosol direct radiative forcing in the vicinity of Houston using a fully coupled meteorology–chemistry–aerosol model. Journal of Geophysical Research: Atmospheres, 111, D21305. https:/...

work page doi:10.1029/2005jd006721 2006
[6]

C., Klemp, J

Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Liu, Z., Berner, J., Wang, W., Powers, J. G., Duda, M. G., Barker, D., and Huang, X.- Y . (2019). A Description of the Advanced Research WRF Model Version 4. NCAR Technical Note NCAR/TN-556+STR. https://doi.org/10.5065/1dfh-6p97

work page doi:10.5065/1dfh-6p97 2019
[7]

Zhang, Y . (2008). Online-coupled meteorology and chemistry models: history, current status, and outlook. Atmospheric Chemistry and Physics , 8, 2895–2932. https://doi.org/10.5194/acp-8-2895- 2008

work page doi:10.5194/acp-8-2895- 2008
[8]

Baklanov, A., Schlünzen, K., Suppan, P ., Baldasano, J., Brunner, D., Aksoyoglu, S., Carmichael, G., Douros, J., Flemming, J., Forkel, R., et al. (2014). Online coupled regional meteorology chemistry models in Europe: current status and prospects. Atmospheric Chemistry and Physics , 14, 317–398. https://doi.org/10.5194/acp-14-317-2014

work page doi:10.5194/acp-14-317-2014 2014
[9]

Gao, M., Xiu, A., Zhang, X., Tong, D., Zhao, H., Liu, S., Zhang, S., Meng, X., Chen, X., Cai, S., et al. (2022). T wo-way coupled meteorology and air quality models in Asia: a systematic review and meta-analysis of impacts of aerosol feedbacks on meteorology and air quality.Atmospheric Chemistry and Physics, 22, 5265–5329. https://doi.org/10.5194/acp-22-5265-2022

work page doi:10.5194/acp-22-5265-2022 2022
[10]

Y ang, H., Chen, L., Liao, H., Zhu, J., Wang, W., and Li, X. (2022). Impacts of aerosol– photolysis interaction and aerosol–radiation feedback on surface-layer ozone in North China dur- ing multi-pollutant air pollution episodes. Atmospheric Chemistry and Physics , 22, 4101–4116. https://doi.org/10.5194/acp-22-4101-2022. 17

work page doi:10.5194/acp-22-4101-2022 2022
[11]

Li, X., Qin, M., Li, L., Gong, K., Shen, H., Li, J., and Hu, J. (2022). Examining the implica- tions of photochemical indicators for O 3–NO𝑥–VOC sensitivity and control strategies: a case study in the Y angtze River Delta (YRD), China. Atmospheric Chemistry and Physics , 22, 14799–14811. https://doi.org/10.5194/acp-22-14799-2022

work page doi:10.5194/acp-22-14799-2022 2022
[12]

Wu, J., Bei, N., Hu, B., Liu, S., Zhou, M., Wang, Q., Li, X., Liu, L., Feng, T., Liu, Z., et al. (2019). Aerosol–radiation feedback deteriorates the wintertime haze in the North China Plain. Atmospheric Chemistry and Physics , 19, 8703–8719. https://doi.org/10.5194/acp-19-8703-2019

work page doi:10.5194/acp-19-8703-2019 2019
[13]

Li, J., Han, Z., Wu, Y ., Xiong, Z., Xia, X., Li, J., Liang, L., and Zhang, R. (2020). Aerosol radiative effects and feedbacks on boundary layer meteorology and PM2.5 chemical components during winter haze events over the Beijing–Tianjin–Hebei region. Atmospheric Chemistry and Physics , 20, 8659–

2020
[14]

https://doi.org/10.5194/acp-20-8659-2020

work page doi:10.5194/acp-20-8659-2020 2020
[15]

J., Sun, J

Petäjä, T., Järvi, L., Kerminen, V .-M., Ding, A. J., Sun, J. N., Nie, W., Kujansuu, J., Virkkula, A., Y ang, X., Fu, C. B., Zilitinkevich, S., and Kulmala, M. (2016). Enhanced air pollution via aerosol- boundary layer feedback in China. Scientific Reports, 6, 18998. https://doi.org/10.1038/srep18998

work page doi:10.1038/srep18998 2016
[16]

J., Huang, X., Nie, W., Sun, J

Ding, A. J., Huang, X., Nie, W., Sun, J. N., Kerminen, V .-M., Petäjä, T., Su, H., Cheng, Y . F., Y ang, X.-Q., Wang, M. H., et al. (2016). Enhanced haze pollution by black carbon in megacities in China. Geophysical Research Letters, 43, 2873–2879. https://doi.org/10.1002/2016GL067745

work page doi:10.1002/2016gl067745 2016
[17]

Wang, Z., Huang, X., and Ding, A. (2018). Dome effect of black carbon and its key influencing factors: a one-dimensional modelling study. Atmospheric Chemistry and Physics , 18, 2821–2834. https://doi.org/10.5194/acp-18-2821-2018

work page doi:10.5194/acp-18-2821-2018 2018
[18]

Sillman, S. (1995). The use of NO 𝑦, H 2O2, and HNO 3 as indicators for ozone–NO 𝑥– hydrocarbon sensitivity in urban locations.Journal of Geophysical Research, 100(D7), 14175–14188. https://doi.org/10.1029/94JD02953

work page doi:10.1029/94jd02953 1995
[19]

Sillman, S. (1999). The relation between ozone, NO 𝑥 and hydrocarbons in urban and pol- luted rural environments. Atmospheric Environment, 33, 1821–1845. https://doi.org/10.1016/S1352- 2310(98)00345-8

work page doi:10.1016/s1352- 1999
[20]

N., Y oshida, Y ., Olson, J

Duncan, B. N., Y oshida, Y ., Olson, J. R., Sillman, S., Martin, R. V ., Lamsal, L., Hu, Y ., Pickering, K. E., Retscher, C., Allen, D. J., and Crawford, J. H. (2010). Application of OMI observations to a space- based indicator of NO 𝑥 and VOC controls on surface ozone formation. Atmospheric Environment, 44, 2213–2223. https://doi.org/10.1016/j.atmosenv.2...

work page doi:10.1016/j.atmosenv.2010.03.010 2010
[21]

Jin, X., and Holloway, T. (2015). Spatial and temporal variability of ozone sensitivity over China observed from the Ozone Monitoring Instrument. Journal of Geophysical Research: Atmospheres , 120, 7229–7246. https://doi.org/10.1002/2015JD023250

work page doi:10.1002/2015jd023250 2015
[22]

Bi, K., Xie, L., Zhang, H., Chen, X., Gu, X., and Tian, Q. (2023). Accurate medium-range global weather forecasting with 3D neural networks. Nature, 619, 533–538. https://doi.org/10.1038/s41586- 023-06185-3

work page doi:10.1038/s41586- 2023
[23]

Lam, R., Sanchez-Gonzalez, A., Willson, M., Wirnsberger, P ., Fortunato, M., Alet, F., Ravuri, S., Ewalds, T., Eaton-Rosen, Z., Hu, W., et al. (2023). Learning skillful medium-range global weather forecasting. Science, 382, 1416–1421. https://doi.org/10.1126/science.adi2336

work page doi:10.1126/science.adi2336 2023
[24]

Price, A

Price, I., Sanchez-Gonzalez, A., Alet, F., Andersson, T. R., El-Kadi, A., Masters, D., Ewalds, T., Stott, J., Mohamed, S., Battaglia, P ., Lam, R., and Willson, M. (2025). Probabilistic weather forecasting with machine learning. Nature, 637, 84–90. https://doi.org/10.1038/s41586-024-08252-9. 18

work page doi:10.1038/s41586-024-08252-9 2025
[25]

Bodnar, W

Bodnar, C., Bruinsma, W. P ., Lucic, A., Stanley, M., Allen, A., Brandstetter, J., Garvan, P ., Riechert, M., Weyn, J. A., Dong, H., et al. (2025). A foundation model for the Earth system. Nature, 641, 1180–1187. https://doi.org/10.1038/s41586-025-09005-y

work page doi:10.1038/s41586-025-09005-y 2025
[26]

Gui, K. et al. (2026). Advancing operational global aerosol forecasting with machine learning.Nature, 651, 658–665. https://doi.org/10.1038/s41586-026-10234-y

work page doi:10.1038/s41586-026-10234-y 2026
[27]

H., Steinbach, M., Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N., and Kumar, V

Karpatne, A., Atluri, G., Faghmous, J. H., Steinbach, M., Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N., and Kumar, V . (2017). Theory-guided data science: a new paradigm for scien- tific discovery from data. IEEE Transactions on Knowledge and Data Engineering , 29, 2318–2331. https://doi.org/10.1109/TKDE.2017.2720168

work page doi:10.1109/tkde.2017.2720168 2017
[28]

Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., and Prabhat. (2019). Deep learning and process understanding for data-driven Earth system science. Nature, 566, 195–204. https://doi.org/10.1038/s41586-019-0912-1

work page doi:10.1038/s41586-019-0912-1 2019
[29]

Guo, Z., Wang, J., Ling, F., Wei, W., Yue, X., Jiang, Z., Xu, W., Luo, J.-J., Cheng, L., Ham, Y .-G., et al. (2025). A self-evolving AI agent system for climate science. arXiv preprint arXiv:2507.17311. https://doi.org/10.48550/arXiv.2507.17311

work page doi:10.48550/arxiv.2507.17311 2025
[30]

Feng, P ., Lv, Z., Y e, J., Wang, X., Huo, X., Yu, J., Xu, W., Zhang, W., Bai, L., He, C., and Li, W. (2025). Earth-Agent: Unlocking the full landscape of Earth observation with agents. arXiv preprint arXiv:2509.23141. https://doi.org/10.48550/arXiv.2509.23141

work page doi:10.48550/arxiv.2509.23141 2025
[31]

Brown, T. B. et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901

2020
[32]

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824–24837

2022
[33]

L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al

Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774

Pith/arXiv arXiv 2023
[34]

Y ao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y . (2023). ReAct: Synergizing reasoning and acting in language models. International Conference on Learning Representations

2023
[35]

Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K., and Y ao, S. (2023). Reflexion: language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems , 36, 8634– 8652

2023
[36]

Wang, L., Ma, C., Feng, X., Zhang, Z., Y ang, H., Zhang, J., Chen, Z., Tang, J., Chen, X., Lin, Y ., et al. (2024). A survey on large language model based autonomous agents. Frontiers of Computer Science, 18, 186345. https://doi.org/10.1007/s11704-024-40231-1

work page doi:10.1007/s11704-024-40231-1 2024
[37]

Boiko, Robert MacKnight, Ben Kline, and Gabe Gomes

Boiko, D. A., MacKnight, R., Kline, B., and Gomes, G. (2023). Autonomous chemical research with large language models. Nature, 624, 570–578. https://doi.org/10.1038/s41586-023-06792-0

work page doi:10.1038/s41586-023-06792-0 2023
[38]

Bran, A. M., Cox, S., Schilter, O., Baldassari, C., White, A. D., and Schwaller, P. (2024). Augmenting large language models with chemistry tools. Nature Machine Intelligence, 6, 525–535. https://doi.org/10.1038/s42256-024-00832-8

[39]

Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y., Zhang, C., Wang, J., Wang, Z., Yau, S. K. S., Lin, Z., et al. (2023). MetaGPT: Meta programming for a multi-agent collaborative framework. arXiv preprint arXiv:2308.00352.

[40]

Wu, Q., Bansal, G., Zhang, J., Wu, Y., Li, B., Zhu, E., Jiang, L., Zhang, X., Zhang, S., Liu, J., et al. (2023). AutoGen: Enabling next-gen LLM applications via multi-agent conversation. arXiv preprint arXiv:2308.08155.

[41]

Lu, C., Lu, C., Lange, R. T., Foerster, J., Clune, J., and Ha, D. (2024). The AI Scientist: Towards fully automated open-ended scientific discovery. arXiv preprint arXiv:2408.06292.

[42]

Ghafarollahi, A., and Buehler, M. J. (2025). SciAgents: Automating scientific discovery through bioinspired multi-agent intelligent graph reasoning. Advanced Materials, 37, 2413523. https://doi.org/10.1002/adma.202413523

[43]

Wang, H., Fu, T., Du, Y., Gao, W., Huang, K., Liu, Z., Chandak, P., Liu, S., Van Katwyk, P., Deac, A., et al. (2023). Scientific discovery in the age of artificial intelligence. Nature, 620, 47–60. https://doi.org/10.1038/s41586-023-06221-2

