arxiv: 2604.17235 · v1 · submitted 2026-04-19 · ⚛️ physics.geo-ph


Massive-scale unlabeled field and labeled synthetic seismic datasets of global shelf-edge clinothems

Hui Gao, Jiarun Yang, Jintao Li, Xiaoming Sun, Xinming Wu

Pith reviewed 2026-05-10 06:05 UTC · model grok-4.3


keywords: seismic dataset, shelf-edge clinothems, deep learning, seismic interpretation, synthetic data, benchmark dataset, field data


The pith

A hybrid dataset of 3000 unlabeled field and 4000 labeled synthetic seismic sections is released to support deep learning for automated shelf-edge clinothem interpretation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tackles the data shortage that has prevented AI methods from automating seismic stratigraphic interpretation of shelf-edge clinothems, structures whose analysis reveals tectonic history, climate shifts, and hydrocarbon systems. The authors compile 3000 real but unlabeled field seismic sections from global locations and generate 4000 labeled synthetic sections through geological and geophysical forward modeling that aims to reproduce real structural complexity. Several baseline deep learning models trained and evaluated on the combined collection produce accurate results. The work concludes that the resulting hybrid dataset supplies an effective and representative foundation for model training, quantitative testing, and downstream applications in basin analysis.

Core claim

A hybrid benchmark dataset is produced by curating 3000 unlabeled field seismic sections and generating 4000 labeled synthetic seismic sections of global shelf-edge clinothems via geological and geophysical forward modeling; baseline deep learning models achieve accurate performance on the collection, establishing it as an effective basis for training, assessment, and practical use in automated seismic stratigraphic interpretation.

What carries the argument

The hybrid benchmark dataset that pairs curated unlabeled field seismic sections with forward-modeled labeled synthetic sections of shelf-edge clinothems.

Load-bearing premise

The synthetic seismic data produced by geological and geophysical forward modeling sufficiently captures the structural complexity, variability, and labeling accuracy of real clinothem systems.

What would settle it

Deep learning models trained only on the synthetic portion of the dataset would show markedly lower accuracy when tested on a large collection of independent real field seismic sections not used in dataset creation.

Figures

Figures reproduced from arXiv: 2604.17235 by Hui Gao, Jiarun Yang, Jintao Li, Xiaoming Sun, Xinming Wu.

**Figure 2.** Spatial distribution and representative patterns of field seismic clinothem …

**Figure 3.** Geological and geophysical forward modeling workflow for constructing …

**Figure 4.** Comparison of RGT predicted results across different baseline models (GLP, …

Original abstract

Seismic stratigraphic interpretation of shelf-edge clinothems is essential for revealing tectonic evolution, paleoclimate change, depositional dynamic conditions, and hydrocarbon generation and accumulation during basin filling. However, traditional interpretation methods remain labor-intensive, time-consuming, and highly subjective. Although AI-based method offer a potential solution for automated this task, its development has been limited by the scarcity of comprehensive and representative benchmark datasets for shelf-edge clinothems. This limitation primarily arises from limited field data availability, the scarcity of reliable geological labels, and the structural complexity and strong variability of clinothem-dominated systems. To address this gap, we develop a hybrid benchmark dataset through two complementary strategies of field data curation and geological and geophysical forward modeling, ultimately generating 3,000 unlabeled field and 4,000 labeled synthetic seismic data, respectively. We further evaluate several representative baseline deep learning models on these datasets, and the accurate results demonstrate that the curated dataset provides an effective and representative basis for model training, quantitative assessment, and practical application. Finally, we have publicly released this hybrid benchmark dataset (https://doi.org/10.5281/zenodo.18910271) to facilitate the development, validation, and assessment of deep learning methods for automated seismic stratigraphic interpretation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper's main deliverable is a public hybrid dataset of real and synthetic seismic volumes for clinothems, but the claim that it forms an effective training basis rests on thin validation.


The new thing here is the scale and hybrid construction: 3000 unlabeled field volumes paired with 4000 labeled synthetics generated by geological and geophysical forward modeling, all released on Zenodo for shelf-edge clinothem interpretation. That directly tackles the labeled-data shortage in this narrow but important corner of seismic stratigraphy, and the public release is a concrete step that others can use right away for training or benchmarking AI models in hydrocarbon or paleoclimate work. The baseline experiments are at least an attempt to show the data can be used, which is better than a pure data-dump paper.

The soft spot is the representativeness of the synthetics. The abstract says the models gave accurate results, yet supplies no metrics, no description of label assignment, and no checks that the forward models reproduce the structural variability, noise, or interpretive ambiguity seen in the real volumes. If the synthetics are too clean or too limited in their parameter space, the good baseline numbers could be an artifact rather than evidence of real-world utility. That matches the stress-test concern and is the load-bearing assumption for anyone wanting to treat this as a practical benchmark.

The paper is aimed at geophysicists and ML practitioners who need domain-specific seismic data; a reader already working on automated stratigraphic interpretation will find the release useful even if they have to do their own validation. It is worth sending to peer review because the dataset itself is new and the effort is substantial, though referees should press for quantitative distributional comparisons between synthetic and field data and for clearer labeling protocols.

Referee Report

3 major / 2 minor

Summary. The paper presents a hybrid benchmark dataset for AI-based seismic stratigraphic interpretation of shelf-edge clinothems, consisting of 3,000 unlabeled field seismic volumes curated from real data and 4,000 labeled synthetic volumes generated via geological and geophysical forward modeling. Baseline deep learning models are evaluated on the datasets, with the authors stating that the models produced accurate results demonstrating the dataset's effectiveness and representativeness for training, assessment, and practical use; the data is released publicly via Zenodo.

Significance. If the synthetic volumes faithfully reproduce the structural complexity, noise characteristics, and interpretive ambiguity of real clinothem systems, and if the baseline evaluations are rigorously quantified, this large-scale hybrid dataset would fill a notable gap in labeled seismic data for machine learning applications in geophysics. It could accelerate development of automated interpretation tools with direct relevance to tectonic, paleoclimate, and hydrocarbon studies.

major comments (3)

[Abstract and results] Abstract and evaluation section: The claim that 'baseline models gave accurate results' and that this 'demonstrate[s] that the curated dataset provides an effective and representative basis' is unsupported by any quantitative metrics (e.g., accuracy, IoU, Dice coefficient, or confusion matrices), model architectures, training protocols, or validation splits. This omission prevents assessment of whether the results reflect genuine generalization or simulation artifacts.
[Methods (synthetic generation)] Synthetic data generation section: No parameter ranges, stochastic perturbation details, noise models, or quantitative distributional comparisons (e.g., amplitude histograms, structural feature statistics, or Kolmogorov-Smirnov tests) are provided to show that the 4,000 forward-modeled volumes reproduce the variability and complexity of the 3,000 field volumes. Without this, the representativeness assumption remains untested.
[Methods (labeling)] Labeling and quality assurance: The process for assigning geological labels to the synthetic volumes, including any expert validation, inter-annotator agreement, or checks for labeling fidelity against real-world interpretive ambiguity, is not described. This directly affects the reliability of supervised training and quantitative assessment claims.

minor comments (2)

[Abstract] Abstract contains a grammatical error: 'offer a potential solution for automated this task' should read 'offer a potential solution for automating this task'.
[Figures] The manuscript would benefit from side-by-side figures or slices comparing synthetic and field data to visually illustrate similarity in structural features and noise characteristics.
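
The distributional comparison requested in the second major comment could be sketched as below. This is a minimal, standard-library Python illustration of a two-sample Kolmogorov-Smirnov statistic, not the authors' or the referee's procedure. The amplitude samples are random placeholders, and the names `ks_statistic`, `field`, `synthetic_close`, and `synthetic_clean` are hypothetical; a real check would draw amplitudes from the released field and synthetic data.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap between two empirical CDFs occurs at a sample value.
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


# Placeholder amplitudes (assumption): Gaussian draws stand in for real data.
rng = random.Random(0)
field = [rng.gauss(0.0, 1.0) for _ in range(2000)]
synthetic_close = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # same spread
synthetic_clean = [rng.gauss(0.0, 0.5) for _ in range(2000)]  # narrower spread

d_close = ks_statistic(field, synthetic_close)
d_clean = ks_statistic(field, synthetic_clean)
print(f"KS(field, matching synthetic)  = {d_close:.3f}")
print(f"KS(field, too-clean synthetic) = {d_clean:.3f}")
```

A statistic near 0 means the two empirical amplitude distributions are hard to tell apart, while a large value flags a mismatch such as synthetics with too little amplitude variability. With SciPy available, `scipy.stats.ks_2samp` returns the same statistic together with a p-value.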

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review of our manuscript. The comments highlight important areas where additional rigor and transparency are needed to strengthen the presentation of our hybrid benchmark dataset. We address each major comment below and commit to revisions that will incorporate quantitative details, methodological clarifications, and expanded descriptions without altering the core contributions of the work.


Referee: [Abstract and results] Abstract and evaluation section: The claim that 'baseline models gave accurate results' and that this 'demonstrate[s] that the curated dataset provides an effective and representative basis' is unsupported by any quantitative metrics (e.g., accuracy, IoU, Dice coefficient, or confusion matrices), model architectures, training protocols, or validation splits. This omission prevents assessment of whether the results reflect genuine generalization or simulation artifacts.

Authors: We agree that the current manuscript does not provide the quantitative metrics, model details, or validation information necessary to fully substantiate the claims in the abstract and evaluation section. The phrase 'accurate results' was based on internal qualitative assessments and visual comparisons performed during dataset development. In the revised manuscript, we will qualify or remove this phrasing from the abstract and add a dedicated evaluation subsection. This will include descriptions of the baseline model architectures, training protocols, validation splits, and quantitative performance metrics such as accuracy, IoU, Dice coefficient, and confusion matrices, along with discussion of potential artifacts versus generalization.

revision: yes

Referee: [Methods (synthetic generation)] Synthetic data generation section: No parameter ranges, stochastic perturbation details, noise models, or quantitative distributional comparisons (e.g., amplitude histograms, structural feature statistics, or Kolmogorov-Smirnov tests) are provided to show that the 4,000 forward-modeled volumes reproduce the variability and complexity of the 3,000 field volumes. Without this, the representativeness assumption remains untested.

Authors: The synthetic volumes were generated using geological and geophysical forward modeling informed by published characteristics of global shelf-edge clinothems, including variations in depositional parameters and realistic noise incorporation. However, we acknowledge that the methods section lacks explicit parameter ranges, perturbation details, noise models, and quantitative comparisons. In the revision, we will add these elements, including tables of parameter ranges, descriptions of stochastic perturbations and noise models, and distributional comparisons such as amplitude histograms, structural feature statistics, and Kolmogorov-Smirnov tests to demonstrate how the synthetics capture the variability of the field data.

revision: yes

Referee: [Methods (labeling)] Labeling and quality assurance: The process for assigning geological labels to the synthetic volumes, including any expert validation, inter-annotator agreement, or checks for labeling fidelity against real-world interpretive ambiguity, is not described. This directly affects the reliability of supervised training and quantitative assessment claims.

Authors: Labels for the synthetic volumes are derived directly from the known geological models used in forward modeling, providing deterministic correspondence to features such as clinothem geometries. The authors, with expertise in seismic stratigraphy, conducted internal validation for fidelity. We did not include inter-annotator agreement metrics because labeling is model-driven rather than subjective annotation. We agree that more detail is required for transparency. In the revised methods section, we will fully describe the labeling process, including derivation from geological models, expert validation steps, and any checks for alignment with interpretive ambiguity in real clinothem systems.

revision: yes
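
The overlap metrics named in the first exchange (IoU and Dice coefficient) can be computed as in the following sketch. It assumes predictions and labels are available as binary masks of equal size, which the source does not confirm: Figure 4 refers to RGT predictions, which may be continuous and would first need converting to discrete classes. The function name `iou_and_dice` and the toy masks are hypothetical.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two equally sized binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of samples")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_n = sum(1 for p in pred if p)
    truth_n = sum(1 for t in truth if t)
    union = pred_n + truth_n - inter
    # Convention (assumption): two empty masks count as perfect agreement.
    if union == 0:
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_n + truth_n)
    return iou, dice


# Toy masks for illustration only; not taken from the dataset.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
iou, dice = iou_and_dice(pred, truth)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

The two scores are related by Dice = 2·IoU / (1 + IoU), so they rank models identically; reporting both mainly helps comparison with prior work that uses one or the other.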

Circularity Check

0 steps flagged

No circularity: direct data curation and release with standard validation


The paper describes curation of 3,000 unlabeled field seismic volumes and generation of 4,000 labeled synthetic volumes via geological and geophysical forward modeling, followed by baseline deep-learning evaluations whose 'accurate results' support the claim of an effective benchmark. No equations, fitted parameters, predictions, or derivations appear in the abstract or described content. The evaluation step is a direct test on the released data rather than a reduction to inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing premises. The contribution is self-contained data generation and public release (Zenodo DOI), with no load-bearing step that collapses to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This work is a data curation and synthetic generation effort rather than a theoretical paper; no free parameters, axioms, or new entities are introduced in a mathematical sense. Assumptions about geological modeling fidelity are implicit but not formalized here.


