Disentangling the effects of sea surface temperature and CO₂ in global machine learned weather-climate emulators
Pith reviewed 2026-06-27 19:12 UTC · model grok-4.3
The pith
Training climate emulators on data where sea surface temperature and CO2 vary independently enables accurate simulation of previously inaccessible scenarios like AMIP +4 K and abrupt 4xCO2.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Trained on a balance of AMIP, equilibrium-climate, and random-CO2 data where SST and CO2 vary independently, together with a total energy conservation constraint, the resulting model accurately emulates its reference model not only in the scenarios where earlier versions succeeded but also in AMIP +4 K and slab-ocean-coupled abrupt 4xCO2 cases where they produced unphysical behavior.
What carries the argument
Random-CO2 reference simulations in which sea surface temperature and CO2 are prescribed to vary independently, breaking their correlation in prior training datasets.
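The decorrelation idea can be illustrated with a toy sampling experiment. This is a minimal sketch and not the paper's data pipeline: the function `sample_forcings`, the value ranges, and the linear SST-to-CO2 link in the correlated case are all hypothetical, chosen only to show how independently drawn forcings remove the correlation that a network could otherwise exploit.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sample_forcings(n, independent, seed=0):
    """Return (sst_anomaly_K, log2_co2_ratio) samples.

    Hypothetical toy setup: CO2 spans 1x to 4x (log2 ratio in [0, 2]).
    In the correlated case the SST anomaly tracks CO2 with small noise;
    in the independent case it is drawn separately over a similar range.
    """
    rng = random.Random(seed)
    log2_co2 = [rng.uniform(0.0, 2.0) for _ in range(n)]
    if independent:
        sst = [rng.uniform(0.0, 4.0) for _ in range(n)]
    else:
        sst = [2.0 * c + rng.gauss(0.0, 0.1) for c in log2_co2]
    return sst, log2_co2


sst_corr, co2_corr = sample_forcings(2000, independent=False)
sst_ind, co2_ind = sample_forcings(2000, independent=True)

r_corr = pearson(sst_corr, co2_corr)
r_ind = pearson(sst_ind, co2_ind)
print(f"correlated-style set:  r = {r_corr:.3f}")
print(f"random-CO2-style set:  r = {r_ind:.3f}")
```

When the two inputs are nearly collinear, any model fit to the data can trade one off against the other without penalty; drawing them independently is what makes their separate contributions identifiable in principle.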
If this is right
- The emulator reproduces reference-model output in AMIP +4 K scenarios.
- The emulator reproduces reference-model output under slab-ocean-coupled abrupt 4xCO2 forcing.
- The model is more data-efficient than predecessors while maintaining accuracy in standard AMIP and equilibrium-climate regimes.
- Enforcing total energy conservation improves interpretability of the emulator's response to separate SST and CO2 changes.
Where Pith is reading between the lines
- The same independent-variation training strategy could be tested on other pairs of correlated climate drivers such as aerosol loading and temperature.
- Extending the method to include interactive ocean or land components would test whether the disentangling benefit survives when more Earth-system feedbacks are active.
- The energy-conservation constraint might be combined with additional physical constraints to further reduce drift in long integrations.
Load-bearing premise
Prescribing SST and CO2 to vary independently in the random-CO2 simulations is enough to overcome the correlation problem and let the model learn their separate effects.
What would settle it
Running the new emulator on AMIP +4 K or abrupt 4xCO2 forcing and observing the same unphysical behavior seen in earlier models would show the central claim does not hold.
Original abstract
While previous versions of the Ai2 Climate Emulator (ACE) have been trained with CO$_2$ as a forcing, they are only accurate within a narrow range of scenarios, for example climate over the last 80 years forced by observed sea surface temperature (SST), sea ice, and CO$_2$ (AMIP), or equilibrium or near-equilibrium climates with CO$_2$ concentrations ranging from 1x to 4x that of the present day. Attempting to simulate climate forced by AMIP SST perturbed by +4 K or the response to an abrupt quadrupling of CO$_2$ results in unphysical behavior. We attribute this to these models being trained on datasets where the SST and CO$_2$ are correlated, limiting their ability to accurately learn their separate effects. In this study we introduce a new class of "random-CO$_2$" reference simulations where the SST and CO$_2$ are prescribed to vary independently. Trained on a balance of AMIP, equilibrium-climate, and random-CO$_2$ data, and including a total energy conservation constraint for improved interpretability, we present a more data-efficient model that not only accurately emulates its reference model in scenarios in which previous models excelled, but also in scenarios like AMIP +4 K and slab-ocean-coupled abrupt 4xCO$_2$ where they did not. Limitations are that it has simplified or prescribed representations of other Earth system components like the ocean, land, and sea ice; does not expose other known climate drivers as forcings; and relies solely on physics-based model output for training data, inheriting the biases relative to observations thereof. Each of these represents an opportunity for future work.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that previous ACE emulators fail in out-of-distribution scenarios (AMIP +4 K, abrupt 4xCO2) because SST and CO2 were correlated in the training data; it introduces random-CO2 reference simulations where SST and CO2 are prescribed independently, trains on a balanced mixture of AMIP, equilibrium, and random-CO2 data, adds a total energy conservation constraint, and reports improved emulation accuracy and physical consistency in the previously failing regimes.
Significance. If the quantitative results hold and the disentanglement is demonstrated, the work would address a recognized limitation in ML climate emulators—the inability to generalize when forcings are decorrelated—while adding a physically motivated constraint. The data-generation strategy and energy constraint are concrete, reproducible contributions that could be adopted by other emulator efforts.
major comments (2)
- [Abstract / §3] Abstract and §3 (training data and model description): the central claim that independent prescription of SST and CO2 in the random-CO2 runs is sufficient to let the network learn their separate effects rests on an untested assumption. The emulator receives joint atmospheric states as input; without an explicit architectural separation (e.g., separate forcing channels) or auxiliary loss (e.g., counterfactual or attribution terms), the network could still learn only joint mappings. The energy constraint improves conservation but does not isolate the two forcings. This is load-bearing for the generalization results in AMIP+4 K and abrupt 4xCO2.
- [Results] Results section (quantitative validation): the abstract states improved performance but supplies no error metrics, skill scores, or uncertainty ranges for the new scenarios. Without these numbers (and comparison to the prior ACE versions on the same test cases), it is impossible to judge whether the claimed improvement is statistically meaningful or merely qualitative.
minor comments (2)
- [Abstract] The limitations paragraph is appropriately candid; it could be expanded with a short discussion of how the simplified ocean/land/sea-ice representations might still induce biases even after the SST/CO2 disentanglement step.
- [Methods] Notation for the energy constraint should be defined explicitly (e.g., which fluxes are included in the total energy residual) so that readers can reproduce the loss term.
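To make the request for explicit notation concrete, one possible form of a global-mean energy residual is sketched here. The budget used is an assumption and is not taken from the paper: it supposes that the change in column-integrated total energy over a time step should equal the net downward flux at the top of the atmosphere minus the net downward flux at the surface. The function names, the flux sign convention, and the cos(latitude) weighting on a regular grid are all hypothetical.

```python
import math


def area_weights(lats_deg):
    """Normalized cos(latitude) weights for a regular latitude grid."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(w)
    return [x / total for x in w]


def global_mean(field, lats_deg):
    """Area-weighted mean of field[lat][lon] on a regular lat-lon grid."""
    w = area_weights(lats_deg)
    return sum(wi * sum(row) / len(row) for wi, row in zip(w, field))


def energy_residual(e_start, e_end, net_toa_down, net_sfc_down, lats_deg, dt_s):
    """Global-mean energy budget residual in W m^-2.

    Assumed budget (hypothetical, not the paper's definition):
    d<E>/dt = <net downward TOA flux> - <net downward surface flux>,
    with E the column total energy in J m^-2.
    """
    tendency = (global_mean(e_end, lats_deg) - global_mean(e_start, lats_deg)) / dt_s
    flux_in = global_mean(net_toa_down, lats_deg) - global_mean(net_sfc_down, lats_deg)
    return tendency - flux_in


# Toy 3x2 grid in which the atmosphere gains exactly 2 W m^-2 for 6 hours.
lats = [-60.0, 0.0, 60.0]
dt = 6 * 3600.0
e0 = [[2.5e9, 2.5e9] for _ in lats]
e1 = [[2.5e9 + 2.0 * dt, 2.5e9 + 2.0 * dt] for _ in lats]
toa = [[3.0, 3.0] for _ in lats]
sfc = [[1.0, 1.0] for _ in lats]

residual = energy_residual(e0, e1, toa, sfc, lats, dt)
print(f"budget residual: {residual:.2e} W m^-2")
```

A training constraint could penalize or correct this residual, but which fluxes enter the budget and how the correction is applied are exactly the details the comment asks the authors to specify.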
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which highlight important aspects of our methodology and presentation. We address each major comment below and will revise the manuscript to improve clarity and completeness.
-
Referee: [Abstract / §3] Abstract and §3 (training data and model description): the central claim that independent prescription of SST and CO2 in the random-CO2 runs is sufficient to let the network learn their separate effects rests on an untested assumption. The emulator receives joint atmospheric states as input; without an explicit architectural separation (e.g., separate forcing channels) or auxiliary loss (e.g., counterfactual or attribution terms), the network could still learn only joint mappings. The energy constraint improves conservation but does not isolate the two forcings. This is load-bearing for the generalization results in AMIP+4 K and abrupt 4xCO2.
Authors: We agree that our method does not include explicit architectural separation or auxiliary losses to enforce disentanglement, and that the energy constraint addresses conservation rather than isolation of forcings. The approach instead relies on exposing the model to training data with independently varying SST and CO2 via the random-CO2 simulations, which breaks the correlations present in standard datasets. The improved performance in decoupled scenarios provides empirical support, but we acknowledge this remains a data-driven assumption rather than a mechanistically proven separation. We will revise §3 to explicitly discuss this assumption, its limitations, and implications for generalization. We will also add a brief ablation comparing performance with and without the random-CO2 data to better quantify its role.
Revision: partial
-
Referee: [Results] Results section (quantitative validation): the abstract states improved performance but supplies no error metrics, skill scores, or uncertainty ranges for the new scenarios. Without these numbers (and comparison to the prior ACE versions on the same test cases), it is impossible to judge whether the claimed improvement is statistically meaningful or merely qualitative.
Authors: The results section of the manuscript includes quantitative validation and comparisons to prior ACE versions, but we accept that the abstract lacks specific metrics. We will revise the abstract to report key error metrics (such as RMSE for temperature, humidity, and wind fields), skill scores, and uncertainty ranges for the AMIP +4 K and abrupt 4xCO2 scenarios, with explicit side-by-side comparisons to previous ACE emulators on the same test cases.
Revision: yes
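As an illustration of the kind of metric promised in the response, an area-weighted RMSE on a regular latitude-longitude grid might look like the sketch below. The cos(latitude) weighting is a common convention and is assumed here; the manuscript's exact metric definition is not given on this page, and the function name and toy fields are hypothetical.

```python
import math


def area_weighted_rmse(pred, target, lats_deg):
    """RMSE over field[lat][lon], weighting each row by cos(latitude)."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(w)
    mse = 0.0
    for wi, p_row, t_row in zip(w, pred, target):
        row_mse = sum((p - t) ** 2 for p, t in zip(p_row, t_row)) / len(p_row)
        mse += (wi / total) * row_mse
    return math.sqrt(mse)


# Toy 3x4 grid: the same 2 K error counts for more near the equator
# than near the poles, because polar grid cells cover less area.
lats = [-80.0, 0.0, 80.0]
truth = [[0.0] * 4 for _ in lats]
uniform_bias = [[1.0] * 4 for _ in lats]
equator_error = [[0.0] * 4, [2.0] * 4, [0.0] * 4]
polar_error = [[2.0] * 4, [0.0] * 4, [2.0] * 4]

rmse_uniform = area_weighted_rmse(uniform_bias, truth, lats)
rmse_equator = area_weighted_rmse(equator_error, truth, lats)
rmse_polar = area_weighted_rmse(polar_error, truth, lats)
print(rmse_uniform, rmse_equator, rmse_polar)
```

Reporting such a score for both the new emulator and prior ACE versions on identical test cases is what would let a reader judge the size of the improvement.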
Circularity Check
No significant circularity; improvements rely on new independent data and constraint
The paper's central claim rests on generating new 'random-CO2' reference simulations where SST and CO2 vary independently (explicitly stated in the abstract), training a neural network emulator on a balanced mix of these plus AMIP and equilibrium data, and adding a total energy conservation constraint. Performance gains on AMIP+4K and abrupt 4xCO2 scenarios are presented as empirical outcomes of this expanded training distribution and constraint, not as quantities derived by construction from fitted parameters or prior self-citations. No equations or steps reduce the claimed disentanglement to a renaming, self-definition, or load-bearing self-citation chain. The architecture receives joint states but the separation is attributed to the data design itself, which is externally generated and falsifiable. This is a standard data-driven ML setup with no load-bearing circular reductions.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: SST and CO2 correlation in training data limits the ability to learn their separate effects.
- Domain assumption: a total energy conservation constraint improves interpretability.