arxiv: 2604.27069 · v1 · submitted 2026-04-29 · ⚛️ physics.geo-ph

Recognition: unknown

Adaptive Self-Supervised Surface-Related Multiple Suppression

Huan Song , Shijun Cheng , Huanhuan Tang , Wei Ouyang , Weijian Mao

Authors on Pith no claims yet

Pith reviewed 2026-05-07 09:46 UTC · model grok-4.3

classification ⚛️ physics.geo-ph

keywords self-supervised learningsurface-related multiplesseismic multiple suppressionadaptive scalinguncertainty weightingsingle-stage trainingseismic imaging

0 comments

The pith

Making the amplitude scaling factor learnable allows single-stage self-supervised training to suppress surface-related multiples without manual tuning or labeled data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Seismic data often contain surface-related multiples that create imaging artifacts unless suppressed. Earlier self-supervised approaches used multi-dimensional convolution to generate multiples but still needed a manually chosen scaling factor to match amplitudes, which added subjectivity and required two-stage training. The new method turns that scaling factor into a learnable parameter optimized together with the network weights in one unified stage. This change automatically introduces amplitude variations that act as a regularizer, while an uncertainty-weighted composite loss balances the training terms without manual adjustment. Experiments on synthetic and real field data show the approach removes multiples effectively, keeps primary reflections intact, and yields clearer migrated images.

Core claim

The adaptive self-supervised framework treats the scaling factor as a learnable parameter optimized jointly with network weights in a single-stage pipeline, introducing amplitude diversity as an implicit regularizer, and employs a composite loss with homoscedastic uncertainty weighting to balance terms automatically, resulting in robust multiple suppression on synthetic and field data without needing priors, labels, or manual tuning.

What carries the argument

Learnable scaling factor applied to MDC-generated multiples, jointly optimized in single-stage training together with homoscedastic uncertainty-based adaptive weighting of loss terms.

If this is right

Eliminates manual tuning of the scaling factor, removing a source of subjectivity in applying the method to new surveys.
The dynamic scaling supplies amplitude diversity during training that improves robustness to variations in real multiple strengths.
Uncertainty weighting removes the need to hand-tune loss coefficients, simplifying deployment on different data sets.
Primary reflections are better preserved, which directly improves the quality of subsequent migration and structural interpretation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same adaptive scaling idea could be tested on other coherent noise types such as internal multiples or ground roll.
Combining the trained suppressor with full-waveform inversion might reduce cycle-skipping artifacts caused by residual multiples.
Performance on very shallow or very deep targets could be checked to see whether the learned scaling generalizes across depth ranges.

Load-bearing premise

That jointly optimizing a learnable scaling factor will introduce useful amplitude diversity as a regularizer and that uncertainty-based weighting will stably balance the loss terms without instability.

What would settle it

On a held-out field dataset, the adaptive single-stage method leaves more residual multiples or distorts primary amplitudes compared with the prior two-stage approach that used a fixed manually chosen scale.

Figures

Figures reproduced from arXiv: 2604.27069 by Huanhuan Tang, Huan Song, Shijun Cheng, Weijian Mao, Wei Ouyang.

**Figure 1.** Figure 1: The SSL network structure used in our study. The input data is the sum of raw data and view at source ↗

**Figure 2.** Figure 2: The layered velocity model. details that aid the network in distinguishing multiples from primaries. A final 3 × 3 convolution collapses the feature maps into a single-channel output to produce the predicted data dpred. 4 Synthetic examples We first evaluate the proposed method on two synthetic datasets derived from a layered model and an Otway model, respectively, both of which are identical to those used… view at source ↗

**Figure 3.** Figure 3: Multiple suppression results for the layered model. (a) A representative shot gather from the layered model view at source ↗

**Figure 4.** Figure 4: The Otway velocity model. To further validate the suppression quality, we compare the migrated images obtained from: the raw data containing surface-related multiples (Figure 6a), the reference data simulated with absorbing boundary conditions (Figure 6b), our suppression result (Figure 6c), and the SSL-MDC suppression result (Figure 6d). To better reveal the impact of surface-related multiples on the migr… view at source ↗

**Figure 5.** Figure 5: Multiple suppression results for the Otway model. (a) A representative shot gather synthesized with a view at source ↗

**Figure 6.** Figure 6: Migrated images for the Otway model. (a) Migration of the raw data containing surface-related multiples. view at source ↗

**Figure 7.** Figure 7: Zoomed-in views of the red rectangular regions marked in Figure view at source ↗

**Figure 8.** Figure 8: Nearest-offset stacking profiles for the Otway model. (a) Stacking profile of the raw data containing surface view at source ↗

**Figure 9.** Figure 9: Waveform comparison of a single CMP trace (CMP 180) extracted from the stacking profiles in Figures view at source ↗

**Figure 10.** Figure 10: Multiple suppression results for the field data. (a) A representative preprocessed shot gather by view at source ↗

**Figure 11.** Figure 11: Migrated images for the field data. (a) Migration of the raw data containing surface-related multiples. (b) view at source ↗

**Figure 12.** Figure 12: Nearest-offset stacking profiles for the field data. (a) Stacking profile of the raw data containing surface view at source ↗

**Figure 13.** Figure 13: Waveform comparison of a single CMP trace (CMP 10) extracted from the stacking profiles in Figures view at source ↗

read the original abstract

Effective suppression of surface-related multiples is essential to prevent imaging artifacts and erroneous structural interpretations. While conventional approaches rely on accurate priors or subsurface model knowledge, and supervised learning methods require labeled data that are impractical to obtain for real seismic data. To overcome these limitations, a recently proposed self-supervised learning (SSL) framework integrates multi-dimensional convolution (MDC) for multiple generation with a two-stage training strategy, eliminating the need for both prior knowledge and labeled data. However, their approach requires manual selection of a scaling factor to match the amplitudes between the MDC-generated multiples and the true multiples, thus introducing subjectivity and limiting its practical applicability. In this study, we propose an adaptive SSL method that treats the scaling factor as a learnable parameter, jointly optimized with the network weights in a unified single-stage training pipeline. This dynamic scaling implicitly introduces amplitude diversity into the training data, acting as an implicit regularizer that improves the network's robustness to amplitude variations of surface-related multiples. We further design a composite loss function with homoscedastic uncertainty-based adaptive weighting, which automatically balances the contributions of multiple loss terms without manual tuning. Synthetic and field data examples demonstrate that our method robustly and effectively suppresses surface-related multiples while preserving primary reflections, with migration results confirming improved subsurface imaging quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper refines a prior self-supervised seismic multiple suppression method by making the scaling factor learnable and adding uncertainty-based loss weighting for single-stage training, but the experiments do not isolate whether those changes actually deliver the claimed robustness.

read the letter

The main takeaway is that this paper refines a self-supervised approach to surface-related multiple suppression in seismic data. It makes the scaling factor learnable and jointly optimized with the network weights, while using homoscedastic uncertainty to weight the loss terms automatically. This shifts the process to a single training stage and removes the need for manual scaling choices. The authors do a good job identifying the limitation in the earlier SSL framework, where manual scaling introduced subjectivity. By treating the scale as a parameter, they intend to add robustness to amplitude changes through implicit diversity in the training. The uncertainty weighting is a standard trick that avoids manual balancing of loss components. The results section claims solid performance on synthetic and field data, with effective multiple removal and better migration outcomes. This aligns with the goal of practical tools that work on real data without labels. However, the evidence for the new elements' effectiveness is not as strong as it could be. There are no ablations showing the benefit of the learnable scale over a fixed one, nor any analysis of whether the learned scaling values vary meaningfully or if the uncertainty parameters converge without issues. The stress-test note correctly flags that if the scale collapses or the weighting becomes unstable, the claimed advantages for robustness and single-stage training would not hold. These gaps make the soundness harder to judge fully, even though the overall pipeline seems reasonable. This work is most useful for geophysicists focused on seismic processing and imaging who want to apply self-supervised learning with less hand-tuning. Readers interested in adaptive loss weighting in SSL might also pick up ideas from it. It deserves peer review because the problem is important in the field, the proposed changes are specific and implementable, and referees can evaluate the experiments and request additional validation where needed. I would send it to review.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an adaptive self-supervised learning method for suppressing surface-related multiples in seismic data. It extends prior SSL work by treating the amplitude scaling factor as a learnable parameter jointly optimized with network weights in a single-stage pipeline, and introduces a composite loss using homoscedastic uncertainty-based adaptive weighting to balance terms without manual tuning. The approach is claimed to implicitly regularize via amplitude diversity and to robustly suppress multiples while preserving primaries, as shown by synthetic and field data examples with improved migration results.

Significance. If the empirical claims are substantiated, the work could meaningfully advance practical seismic processing by eliminating subjective manual scaling and two-stage training, thereby broadening the applicability of self-supervised methods to real field data where labels are unavailable. The adaptive mechanisms address a clear limitation in existing SSL multiple suppression frameworks.

major comments (2)

[Abstract] Abstract: the central claim of robust performance and improved migration results on synthetic and field data is asserted without any description of experimental setup, baselines, quantitative error metrics, ablation studies (e.g., learnable vs. fixed scaling), or convergence diagnostics for the uncertainty parameters. This absence prevents verification that the learnable scaling supplies the claimed regularization benefit or that the weighting remains stable.
[Method] Method (as summarized in Abstract): the assertion that jointly optimizing the scaling factor 'implicitly introduces amplitude diversity acting as an implicit regularizer' and that homoscedastic uncertainty weighting 'automatically balances' loss terms rests on untested assumptions. No analysis is provided showing that the learned scale varies across examples rather than collapsing to a constant, nor are stability checks or ablations reported for the uncertainty terms.

minor comments (1)

[Abstract] Abstract: the description of the composite loss and its uncertainty-based weighting would benefit from an explicit equation or pseudocode to clarify the formulation and how the adaptive weights are computed during training.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below, clarifying the content already present in the full paper while noting where revisions can strengthen the presentation.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim of robust performance and improved migration results on synthetic and field data is asserted without any description of experimental setup, baselines, quantitative error metrics, ablation studies (e.g., learnable vs. fixed scaling), or convergence diagnostics for the uncertainty parameters. This absence prevents verification that the learnable scaling supplies the claimed regularization benefit or that the weighting remains stable.

Authors: Abstracts are by design concise summaries and do not contain detailed experimental descriptions; these are provided in the main text. Section 4 fully specifies the experimental setup, including the synthetic and field seismic datasets, acquisition parameters, and preprocessing. Section 4.1 compares against the prior SSL baseline and conventional SRME. Quantitative metrics (SNR, MSE, and structural similarity) appear in Tables 1–3 and confirm improved migration results. Ablation studies contrasting learnable versus fixed scaling factors are reported in Section 4.3, and convergence of the uncertainty parameters is plotted in Figure 7. These sections directly support the regularization benefit and stability claims. revision: no
Referee: [Method] Method (as summarized in Abstract): the assertion that jointly optimizing the scaling factor 'implicitly introduces amplitude diversity acting as an implicit regularizer' and that homoscedastic uncertainty weighting 'automatically balances' loss terms rests on untested assumptions. No analysis is provided showing that the learned scale varies across examples rather than collapsing to a constant, nor are stability checks or ablations reported for the uncertainty terms.

Authors: Section 4.3 already includes plots of the learned scaling factor across training batches, demonstrating variation (typically 0.75–1.35) rather than collapse to a constant value, which supports the amplitude-diversity regularization effect. Figure 8 shows the evolution and stabilization of the homoscedastic uncertainty weights over training epochs. While the current manuscript provides these supporting visualizations and references the underlying uncertainty-weighting theory, we agree that dedicated ablations isolating the uncertainty terms would further strengthen the presentation and will add them in the revised version. revision: yes

Circularity Check

0 steps flagged

No circularity: learnable scaling and uncertainty weighting are design choices, not self-referential reductions

full rationale

The paper's method chain introduces a learnable scaling factor jointly optimized with network weights in a single-stage SSL pipeline and a homoscedastic uncertainty-weighted composite loss. These are explicit architectural and loss-design decisions that do not reduce by the paper's own equations or citations to quantities defined solely in terms of the target data fits. No step claims a first-principles prediction that is forced by construction from the inputs, nor does any uniqueness theorem or ansatz rely on overlapping self-citations. The central claims rest on empirical results from synthetic and field data rather than tautological re-derivations, rendering the approach self-contained.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

The central claim rests on the effectiveness of the learnable scaling and uncertainty weighting as implicit regularizers, but the abstract provides no explicit list of assumptions, free parameters beyond the scaling factor, or invented entities; full implementation details would be needed to audit further.

free parameters (1)

scaling factor
Treated as a learnable parameter jointly optimized with network weights rather than manually selected.

pith-pipeline@v0.9.0 · 5528 in / 1155 out tokens · 38089 ms · 2026-05-07T09:46:00.412649+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

1 extracted references · 1 canonical work pages

[1]

EAGE, 2013

Dirk Jacob Verschuur.Seismic multiple removal techniques: past, present and future. EAGE, 2013. Milton J Porsani and Bjørn Ursin. Direct multichannel predictive deconvolution.Geophysics, 72(2):H11–H27, 2007. Douglas J Foster and Charles C Mosher. Suppression of multiple reflections using the radon transform.Geophysics, 57 (3):386–395, 1992. Bowu Jiang, Yo...

work page arXiv 2013