arxiv: 2604.08772 · v1 · submitted 2026-04-09 · ⚛️ physics.ao-ph · cs.LG

CERBERUS: A Three-Headed Decoder for Vertical Cloud Profiles

Emily K. deJong, Nipun Gunawardena, Kevin Smalley, Hassan Beydoun, Peter Caldwell

Pith reviewed 2026-05-10 16:42 UTC · model grok-4.3

classification ⚛️ physics.ao-ph cs.LG

keywords vertical cloud profiles · radar reflectivity · geostationary satellite · probabilistic inference · encoder-decoder network · zero-inflated distribution · atmospheric remote sensing

The pith

CERBERUS generates vertical radar reflectivity profiles from geostationary satellite data by predicting full probability distributions at each height.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CERBERUS as a way to create three-dimensional cloud information from the two-dimensional brightness temperatures that satellites routinely provide. It does this by training a neural network on paired ground radar and satellite observations so the model learns to output not single numbers but entire distributions of possible reflectivity values at every vertical level. A reader would care because weather and climate models need vertical cloud structure to represent processes like precipitation and radiation accurately, yet global data have long lacked it. The method recovers realistic cloud layers even in complicated multilayer cases and supplies uncertainty ranges that widen where the atmosphere is ambiguous. This points toward a practical route for creating synthetic vertical observations that models can use directly.

Core claim

CERBERUS is a probabilistic inference framework that takes geostationary satellite brightness temperatures, near-surface meteorological variables, and time context as input and produces a zero-inflated vertically-resolved distribution of radar reflectivity. Trained on Ka-band radar at one mid-latitude site, the model recovers coherent vertical structures across different cloud regimes, generalizes to held-out time periods, and yields uncertainty estimates that increase in multilayer and dynamically complex situations.

What carries the argument

Three-headed encoder-decoder architecture that outputs a zero-inflated distribution for radar reflectivity at each vertical level.
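
To make the zero-inflated output concrete, here is a minimal sketch of a per-level negative log-likelihood. It assumes a Beta distribution for the continuous part and reflectivity rescaled to [0, 1), with 0 meaning no echo. The roles given to the three heads (a clear-sky probability plus two Beta shape parameters) and the function names are illustrative assumptions; this summary does not specify the paper's distribution family or head assignment.

```python
import math


def zero_inflated_beta_nll(y, p_clear, alpha, beta, eps=1e-6):
    """Negative log-likelihood of one value under a zero-inflated Beta.

    y       : observed reflectivity rescaled to [0, 1); 0.0 means no echo
    p_clear : assumed head 1, probability of no echo at this level
    alpha   : assumed head 2, first Beta shape parameter (> 0)
    beta    : assumed head 3, second Beta shape parameter (> 0)
    """
    # Keep the probability away from 0 and 1 so the logs stay finite.
    p_clear = min(max(p_clear, eps), 1.0 - eps)
    if y == 0.0:
        # Point mass at zero: only the clear-sky head contributes.
        return -math.log(p_clear)
    y = min(max(y, eps), 1.0 - eps)
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    log_pdf = (alpha - 1.0) * math.log(y) + (beta - 1.0) * math.log(1.0 - y) - log_norm
    # Echo present: probability of an echo times the Beta density.
    return -(math.log(1.0 - p_clear) + log_pdf)


def profile_nll(profile, heads):
    """Mean NLL over vertical levels; heads is a list of (p_clear, alpha, beta)."""
    total = sum(zero_inflated_beta_nll(y, *h) for y, h in zip(profile, heads))
    return total / len(profile)
```

Splitting the likelihood this way lets the clear-sky head be trained on every level while the shape parameters receive gradient only from levels where an echo was observed.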

If this is right

The framework supplies vertical cloud information at the times and locations where only satellite views exist.
Uncertainty estimates widen precisely where physical ambiguity is high, such as overlapping cloud layers.
Distribution-valued outputs allow models to ingest probabilistic rather than deterministic cloud fields.
The same architecture can be retrained on other radar frequencies or surface networks once paired data become available.
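
As an illustration of the third point, a downstream model that cannot ingest a full distribution could still use its first two moments. The sketch below computes the mean and standard deviation of a zero-inflated Beta, under the same assumptions as any such example: the Beta family and the parameter names are hypothetical, not taken from the paper.

```python
import math


def zi_beta_moments(p_clear, alpha, beta):
    """Mean and standard deviation of a zero-inflated Beta on [0, 1).

    p_clear is the probability of the point mass at zero; alpha and beta
    are the Beta shape parameters of the non-zero part (assumed names).
    """
    beta_mean = alpha / (alpha + beta)
    beta_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    p_echo = 1.0 - p_clear
    mean = p_echo * beta_mean
    # E[Y^2] of the mixture; the zero component contributes nothing.
    second_moment = p_echo * (beta_var + beta_mean ** 2)
    variance = max(second_moment - mean ** 2, 0.0)
    return mean, math.sqrt(variance)
```

Note that the spread grows when `p_clear` is near 0.5 even if the Beta part is narrow, which is one way uncertainty about cloud presence shows up in the summary statistics.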

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the method scales, climate model evaluation could shift from comparing only column-integrated quantities to comparing full vertical profiles with global coverage.
The uncertainty maps could serve as weights in data assimilation schemes that blend satellite and model information.
Extending the input channels to include additional wavelengths or reanalysis variables would test whether the three-headed structure remains effective.
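
The data assimilation idea can be sketched with the simplest possible blend: inverse-variance weighting of a synthetic observation and a model background value at a single level. This is a textbook scalar update, not a scheme proposed in the paper, and it assumes independent Gaussian errors; the function and argument names are hypothetical.

```python
def inverse_variance_blend(obs, obs_std, background, background_std):
    """Blend two estimates of the same quantity, weighting each by 1 / variance.

    obs, obs_std               : synthetic observation and its uncertainty
    background, background_std : model value and its uncertainty
    Returns the blended value and its standard deviation.
    """
    w_obs = 1.0 / obs_std ** 2
    w_bg = 1.0 / background_std ** 2
    blended = (w_obs * obs + w_bg * background) / (w_obs + w_bg)
    blended_std = (w_obs + w_bg) ** -0.5
    return blended, blended_std
```

Where the predicted uncertainty is wide, for example under overlapping layers, the blend leans on the model background; where it is narrow, the synthetic observation dominates.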

Load-bearing premise

Relationships learned from ground-based radar at a single mid-latitude site will transfer to satellite observations of clouds in all global regimes.

What would settle it

Direct comparison of CERBERUS-generated vertical profiles against independent ground or airborne radar measurements collected at tropical or polar sites during known multilayer or rapidly changing cloud events.
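
One way such a comparison could be scored is a per-profile RMSE grouped by cloud regime, similar in spirit to the per-sample RMSE statistics the paper's figures report. The sketch below is an assumption about how that might be computed: the regime labels, the data layout, and the choice to skip levels where the radar value is NaN are all hypothetical.

```python
import math
from collections import defaultdict


def profile_rmse(predicted, observed):
    """RMSE over the vertical levels of one profile, skipping NaN observations."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if not math.isnan(o)]
    if not pairs:
        return float("nan")
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))


def rmse_by_regime(samples):
    """Average per-profile RMSE for each regime.

    samples: iterable of (regime_label, predicted_profile, observed_profile).
    """
    grouped = defaultdict(list)
    for regime, predicted, observed in samples:
        grouped[regime].append(profile_rmse(predicted, observed))
    return {regime: sum(values) / len(values) for regime, values in grouped.items()}
```

Reporting the score per regime rather than pooled keeps frequent, easy cases from hiding poor performance on rare multilayer scenes.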

Figures

Figures reproduced from arXiv: 2604.08772 by Emily K. deJong, Hassan Beydoun, Kevin Smalley, Nipun Gunawardena, Peter Caldwell.

**Figure 2.** Statistics of per-sample RMSE (dBZ) for each cloud regime (see Table A.1) over the …

**Figure 3.** Test data and predicted reflectivity mean and uncertainty as a function of altitude, ranked …

**Figure 4.** Illustration of the time-evolving reflectivity on Jan 29, 2025 (test set): true KAZR data …

Original abstract

Atmospheric clouds exhibit complex three-dimensional structure and microphysical details that are poorly constrained by the predominantly two-dimensional satellite observations available at global scales. This mismatch complicates data-driven learning and evaluation of cloud processes in weather and climate models, contributing to ongoing uncertainty in atmospheric physics. We introduce CERBERUS, a probabilistic inference framework for generating vertical radar reflectivity profiles from geostationary satellite brightness temperatures, near-surface meteorological variables, and temporal context. CERBERUS employs a three-headed encoder-decoder architecture to predict a zero-inflated (ZI) vertically-resolved distribution of radar reflectivity. Trained and evaluated using ground-based Ka-band radar observations at the ARM Southern Great Plains site, CERBERUS recovers coherent structures across cloud regimes, generalizes to withheld test periods, and provides uncertainty estimates that reflect physical ambiguity, particularly in multilayer and dynamically complex clouds. These results demonstrate the value of distribution-based learning targets for bridging observational scales, introducing a path toward model-relevant synthetic observations of clouds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces CERBERUS, a probabilistic three-headed encoder-decoder framework that predicts zero-inflated vertical distributions of radar reflectivity from geostationary satellite brightness temperatures, near-surface meteorological variables, and temporal context. Trained and evaluated exclusively on collocated Ka-band radar, satellite, and surface data from the ARM Southern Great Plains site, the model is claimed to recover coherent vertical cloud structures across regimes, generalize to withheld temporal test periods at the same site, and produce uncertainty estimates that reflect physical ambiguity in multilayer and complex clouds. This is presented as demonstrating the value of distribution-based targets for bridging 2D satellite to 3D cloud observations and opening a path to model-relevant synthetic observations at global scales.

Significance. If the core results on temporal generalization and uncertainty calibration at a single site hold under more rigorous quantitative scrutiny, the distribution-based learning target and three-headed architecture represent a useful technical contribution for handling the probabilistic nature of cloud vertical structure retrievals. The approach could aid in generating synthetic observations for model evaluation, but the single-site training and evaluation substantially limit its immediate significance for the claimed global-scale applications.

major comments (3)

Abstract: The claim that CERBERUS 'generalizes to withheld test periods' and opens 'a path toward model-relevant synthetic observations of clouds' at global scales is not supported by evidence, as all training and evaluation use data from only the ARM Southern Great Plains mid-latitude site, with no cross-site, cross-latitude, or cross-regime testing described.
Abstract and Results: No quantitative metrics (e.g., RMSE, CRPS, or calibration scores for the ZI distributions), error bars, ablation studies, or detailed validation protocol are supplied to substantiate claims of recovering coherent structures and physically meaningful uncertainty estimates; evaluation appears confined to qualitative assessment on one site.
Methods/Results: The assumption that relationships learned from ground-based Ka-band radar at a single continental mid-latitude site transfer to geostationary satellite observations across diverse global cloud regimes is untested, directly undercutting the central claim of broader applicability for synthetic observations.
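
For readers unfamiliar with the CRPS requested in the second comment, the following sketch shows the standard sample-based estimator, CRPS ≈ E|X − y| − ½ E|X − X′|, applied to draws from a zero-inflated distribution. The Beta family, the sampler, and all names are assumptions for illustration; the paper's own scoring procedure is not described in this summary.

```python
import random


def sample_zi_beta(p_clear, alpha, beta, n, rng):
    """Draw n values: 0.0 with probability p_clear, otherwise a Beta variate."""
    return [0.0 if rng.random() < p_clear else rng.betavariate(alpha, beta)
            for _ in range(n)]


def crps_from_samples(samples, observation):
    """Sample-based CRPS estimate; lower is better, 0 is a perfect point forecast."""
    n = len(samples)
    # Mean absolute error of the samples against the observation.
    term_obs = sum(abs(x - observation) for x in samples) / n
    # Half the mean absolute difference between all pairs of samples.
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2.0 * n * n)
    return term_obs - term_spread


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
draws = sample_zi_beta(0.3, 2.0, 5.0, 200, rng)
score = crps_from_samples(draws, observation=0.25)
```

Because the estimator needs only samples, the same function scores the point mass at zero and the continuous part together, without a closed-form CRPS for the mixture.
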

minor comments (1)

Abstract: The title refers to a 'three-headed decoder', but the abstract does not explain the roles of the three heads or how they relate to the zero-inflated output; one added sentence would suffice.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback, which correctly identifies that several claims exceed the scope of the presented experiments. We agree that the abstract requires revision to avoid overstating generalization and applicability. We address each major comment below and will incorporate the necessary changes.

Referee: Abstract: The claim that CERBERUS 'generalizes to withheld test periods' and opens 'a path toward model-relevant synthetic observations of clouds' at global scales is not supported by evidence, as all training and evaluation use data from only the ARM Southern Great Plains mid-latitude site, with no cross-site, cross-latitude, or cross-regime testing described.

Authors: The phrase 'generalizes to withheld test periods' is supported by the temporal hold-out experiments at the single ARM SGP site. However, we agree that references to global scales are not evidenced by the current results and overstate the work. We will revise the abstract to explicitly note the single-site training and evaluation, clarify that withheld periods are temporal only, and reframe global synthetic observations as a prospective direction rather than a demonstrated outcome.

Revision: yes

Referee: Abstract and Results: No quantitative metrics (e.g., RMSE, CRPS, or calibration scores for the ZI distributions), error bars, ablation studies, or detailed validation protocol are supplied to substantiate claims of recovering coherent structures and physically meaningful uncertainty estimates; evaluation appears confined to qualitative assessment on one site.

Authors: The current manuscript relies primarily on qualitative profile visualizations to illustrate coherent structures and uncertainty behavior. We will add quantitative evaluation in the revised version, including CRPS scores for the zero-inflated distributions, calibration diagnostics for the uncertainty estimates, error bars on key performance measures, ablation experiments on the three-headed architecture, and an expanded description of the temporal validation protocol.

Revision: yes
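
A calibration diagnostic of the kind promised here could be as simple as the empirical coverage of central prediction intervals: a calibrated 90% interval should contain roughly 90% of the observations. The sketch below is one hypothetical form of that check, built from predictive samples with nearest-rank interval bounds; it is not the authors' protocol.

```python
import math


def interval_coverage(sample_sets, observations, level=0.9):
    """Fraction of observations inside the central `level` interval.

    sample_sets  : one list of predictive samples per observation
    observations : the matching observed values
    Bounds use nearest ranks rounded outward, so intervals are slightly wide.
    """
    tail = (1.0 - level) / 2.0
    hits = 0
    for samples, obs in zip(sample_sets, observations):
        ordered = sorted(samples)
        last = len(ordered) - 1
        lo_idx = max(0, math.floor(tail * last))
        hi_idx = min(last, math.ceil((1.0 - tail) * last))
        # Endpoints count as covered, so an observed 0.0 is inside an
        # interval whose lower bound sits on the point mass at zero.
        if ordered[lo_idx] <= obs <= ordered[hi_idx]:
            hits += 1
    return hits / len(observations)
```

Repeating the check at several levels (for example 50%, 80%, and 90%) gives a reliability curve; coverage well below the nominal level indicates overconfident uncertainty estimates.
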
Referee: Methods/Results: The assumption that relationships learned from ground-based Ka-band radar at a single continental mid-latitude site transfer to geostationary satellite observations across diverse global cloud regimes is untested, directly undercutting the central claim of broader applicability for synthetic observations.

Authors: We acknowledge that transferability across regimes remains untested, as the radar targets are site-specific even though the satellite inputs are globally available. We will revise the discussion to state this limitation explicitly, emphasize that the work is a proof-of-concept demonstration at one mid-latitude site, and position multi-site validation as required future work for global-scale claims. The technical contribution of the zero-inflated three-headed decoder for capturing physical ambiguity stands independently of this scope limitation.

Revision: yes

Circularity Check

0 steps flagged

No circularity detected; standard supervised ML training and temporal hold-out evaluation

The paper trains a three-headed encoder-decoder on collocated Ka-band radar targets, satellite brightness temperatures, and surface variables at a single site, then evaluates on temporally withheld periods at the same site. The claimed outputs (coherent vertical structures, uncertainty estimates) are learned mappings from the neural network, not quantities that reduce to the inputs by definition or by construction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing premises; the derivation chain is ordinary supervised learning against an external target dataset. Generalization claims are limited to the described test periods and do not rely on any tautological re-use of fitted quantities.
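
The temporal hold-out described above can be sketched as a split on timestamps rather than a random shuffle, so that no test sample comes from a period the model trained on. The record layout and the date range in the usage lines are hypothetical; the paper's actual test periods are not listed in this summary.

```python
from datetime import datetime


def temporal_split(records, test_start, test_end):
    """Split (timestamp, payload) records into train and test by time window.

    Records with test_start <= timestamp < test_end go to the test set;
    everything else goes to the training set.
    """
    train, test = [], []
    for timestamp, payload in records:
        target = test if test_start <= timestamp < test_end else train
        target.append((timestamp, payload))
    return train, test


# Hypothetical usage: hold out one month of collocated scenes.
records = [
    (datetime(2024, 12, 31, 23, 0), "scene-a"),
    (datetime(2025, 1, 29, 12, 0), "scene-b"),
    (datetime(2025, 2, 1, 0, 0), "scene-c"),
]
train, test = temporal_split(records, datetime(2025, 1, 1), datetime(2025, 2, 1))
```

A contiguous window is used because neighbouring radar scenes are strongly correlated in time, and a random split would place near-duplicates of test scenes in the training set.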

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no identifiable free parameters, axioms, or invented entities beyond standard neural network training assumptions. No specific model hyperparameters, loss terms, or physical constraints are detailed.
