arxiv: 2604.00551 · v2 · submitted 2026-04-01 · ⚛️ physics.geo-ph

Horizontal-Component Prior-based Framework for Adaptive Shear-wave Leakage Suppression in OBC Data

Zheng Cong, Shiqi Dong, Xintong Dong, Xunqian Tong

Pith reviewed 2026-05-13 22:26 UTC · model grok-4.3

classification ⚛️ physics.geo-ph

keywords OBC seismic data · shear-wave leakage suppression · deep learning denoising · horizontal component prior · adaptive noise suppression · multi-component seismic · label-free training

The pith

The HPAS framework generates input-label pairs from horizontal-component shear noise to train a model that suppresses leakage in the vertical OBC component without clean labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method to remove unwanted shear-wave noise from the vertical component of ocean-bottom seismic recordings. Traditional approaches depend on manual tuning and idealized model assumptions, and supervised deep-learning approaches need clean training labels that field data cannot supply. This framework builds its own training examples: it takes noise patterns from the horizontal components, rescales them so that their mean and variance match the leakage in the vertical, and then adds the result to and subtracts it from the recorded vertical trace to form the input and label of each pair. The model learns to remove the leakage while keeping the primary waves intact. This matters because it allows effective denoising on real field data where perfect labels are unavailable.

Core claim

Instead of relying on clean primary-wave data, HPAS generates input-label pairs directly from raw multi-component field data using an additive-subtractive noise strategy. Specifically, shear-wave noise is extracted from the horizontal components, and a linear transformation is applied to match its first- and second-order moments with those of the S-wave leakage in the Z-component. The statistically matched noise is then added to and subtracted from the original Z-component to create the input and label pairs. Because the denoising model learns the S-wave features present in the difference between the input and the label, the adaptive denoising process approximates supervised learning.
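The pair construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names `match_moments` and `make_pair` are hypothetical, the target mean and standard deviation of the leakage are passed in as plain numbers because the paper's estimator for them is not given here, and the convention "input = Z plus noise, label = Z minus noise" is assumed.

```python
import numpy as np


def match_moments(h_noise, target_mean, target_std):
    """Linearly rescale h_noise so its mean and std equal the targets."""
    src_mean = h_noise.mean()
    src_std = h_noise.std()
    if src_std == 0:
        raise ValueError("horizontal noise has zero variance")
    scale = target_std / src_std
    return scale * (h_noise - src_mean) + target_mean


def make_pair(z, matched_noise):
    """Assumed convention: input = z + noise, label = z - noise."""
    return z + matched_noise, z - matched_noise


rng = np.random.default_rng(0)
z = rng.normal(size=(64, 256))                    # stand-in for a raw Z gather
h_noise = 3.0 * rng.normal(size=(64, 256)) + 1.5  # stand-in for extracted S-wave noise

# Hypothetical leakage statistics; how the paper estimates them is not shown here.
matched = match_moments(h_noise, target_mean=0.0, target_std=0.4)
net_input, net_label = make_pair(z, matched)
```

By construction the difference between `net_input` and `net_label` is exactly twice the matched noise, which is the signal the network is expected to pick up.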

What carries the argument

The additive-subtractive noise strategy that creates input-label pairs by extracting S-wave noise from horizontal components and applying a linear transformation to match first- and second-order moments with leakage in the Z-component.

If this is right

Suppresses S-wave leakage while preserving P-wave amplitudes in the Z-component on both synthetic and field data.
Approximates supervised learning performance without access to clean primary-wave labels.
Offers a robust solution with strong generalization capabilities for OBC data processing and imaging.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may reduce dependence on synthetic data generation for training denoising models in other multi-component seismic settings.
It could extend to noise suppression tasks where one component provides statistical priors for leakage in another.
Performance would likely vary if the linear moment-matching step fails to capture higher-order statistical differences between horizontal noise and vertical leakage.

Load-bearing premise

That a linear transformation matching only first- and second-order moments of horizontal-component shear-wave noise is sufficient to create valid input-label pairs that allow the model to learn the actual leakage present in the vertical component.

What would settle it

Apply the framework to synthetic OBC data with a known exact amount of added shear leakage in the Z-component and check whether the output matches the true clean P-wave signal more closely than baseline methods.
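A minimal sketch of how such a check could be scored, assuming a synthetic trace whose clean P-wave signal and injected leakage are both known. The two "denoiser outputs" below are hypothetical stand-ins (fixed fractions of the leakage left behind), not results from HPAS or from any baseline.

```python
import numpy as np


def snr_db(clean, estimate):
    """SNR in dB of an estimate measured against the known clean signal."""
    noise_power = np.sum((estimate - clean) ** 2)
    signal_power = np.sum(clean ** 2)
    return 10.0 * np.log10(signal_power / noise_power)


def snr_gain_db(clean, noisy, denoised):
    """Improvement in SNR achieved by a denoiser, in dB."""
    return snr_db(clean, denoised) - snr_db(clean, noisy)


t = np.linspace(0.0, 1.0, 1000, endpoint=False)
x_clean = np.sin(2 * np.pi * 30 * t)      # stand-in P-wave signal
leak = 0.5 * np.sin(2 * np.pi * 8 * t)    # known injected leakage
z_noisy = x_clean + leak

# Hypothetical outputs: method A leaves 10 % of the leakage, method B leaves 50 %.
out_a = x_clean + 0.1 * leak
out_b = x_clean + 0.5 * leak

gain_a = snr_gain_db(x_clean, z_noisy, out_a)
gain_b = snr_gain_db(x_clean, z_noisy, out_b)
```

The method whose gain is larger has an output closer to the true clean signal, which is the comparison the test above calls for.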

Figures

Figures reproduced from arXiv: 2604.00551 by Shiqi Dong, Xintong Dong, Xunqian Tong, Zheng Cong.

**Figure 1.** Horizontal-component priors-based framework for adaptive shear-wave leakage suppression. Denoising theory: The original Z-component data can be decomposed into P waves and S-wave leakage: $z = x + n$ (1), where $z$, $x$, and $n$ represent the noisy Z-component data, the clean P-wave signal, and the S-wave leakage noise, respectively. Monroy et al. (2025) proposed a noise prediction model that trains a neural network based on …

**Figure 2.** Denoising result of synthetic data with real S waves. (a) The noisy data. (b) The …

**Figure 4.** The comparison of the 150th shot of predictions. (a) The comparison of …

**Figure 6.** The F-K spectra of the S-wave leakage suppression based on the field OBC data. (a) The P-component data. (b) The Z-component data. (c) The denoised result of the Z-component data using polarization filtering. (d) The denoised result of the Z-component data using the Radon transform. (e) The denoised result of the HPAS framework. (f) Separated S …

**Figure 8.** The F-K spectra of the S-wave leakage suppression based on the field OBC data from another survey line. (a) The P-component data. (b) The Z-component data. (c) The denoised result of the Z-component data using polarization filtering. (d) The denoised result of the Z-component data using the Radon transform. (e) The denoised result of the HPAS framework. (f) Separated S waves using polarization filtering. (g) S…

Original abstract

Shear-wave leakage in the vertical (Z) component of ocean-bottom cable (OBC) seismic data commonly results from the receiver tilt and poor seafloor coupling, introducing unwanted coherent noise that impacts the subsequent data processing and imaging. Traditional denoising methods are limited by manual parameter tuning and idealized model assumptions, while deep-learning (DL) approaches have shown significant potential in suppressing shear-wave leakage. However, supervised learning requires clean primary waves (P waves) as the label, which is generally impractical to obtain for field data. To address these challenges, we propose a framework based on horizontal-component priors for adaptive shear-wave leakage suppression (HPAS). Instead of relying on clean primary-wave (P-wave) data, HPAS generates input-label pairs directly from raw multi-component field data using an additive-subtractive noise strategy. Specifically, we extract shear-wave (S-wave) noise from the horizontal components and apply a linear transformation to match its first and second order moments with the S-wave leakage in the Z-component, and the statistically matched noise is then added to and subtracted from the original Z-component to create the input and label pairs. By allowing the denoising model to learn the S-wave features present in the differences between the input and the label, the adaptive denoising process approximates supervised learning. Evaluations on both synthetic and field data demonstrate that the proposed HPAS framework effectively and adaptively suppresses S-wave leakage while preserving the amplitude of the P-wave signals in the Z-component, offering a robust solution with strong generalization capabilities.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

HPAS offers a practical route to supervised-style training on raw OBC field data by moment-matching horizontal S-noise to Z-leakage, but the linear proxy may not fully capture the actual noise operator.

The core idea is straightforward and useful: pull S-wave energy from the horizontal components, apply a linear transform so its first and second moments match the leakage visible in the vertical trace, then add and subtract that matched noise from the raw Z data to create input-label pairs. The network is trained to recover the difference, which approximates supervised denoising without needing clean P-wave labels. That construction is the real novelty here, and it sidesteps the usual barrier in field seismic work where ground-truth clean data is unavailable.

The abstract shows the method was tested on both synthetic and real OBC gathers, with claims of leakage suppression while keeping P-wave amplitudes intact. If the full results hold up with decent metrics and comparisons, this is a solid engineering step for people already processing ocean-bottom data.

The soft spot is exactly the one flagged in the stress-test note. Matching only mean and variance through a linear map does not guarantee the same spectrum, spatial correlation, or phase relationship to the underlying signal. If those differ, the network learns to suppress a surrogate rather than the real leakage, and any amplitude-preservation claim becomes harder to trust. The paper would need to show that the matched noise reproduces the actual leakage statistics beyond the first two moments, perhaps with spectral plots or cross-validation on the synthetic cases. Without that, the generalization story stays partly untested.

This is aimed at geophysicists doing OBC denoising who already use or are open to deep-learning tools. It is narrow but concrete, and the explicit pair-generation strategy makes it worth a referee's time even if revisions are needed on the validation side. I would send it to review.
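The concern about statistics beyond the first two moments can be made concrete with a toy example. The sketch below uses synthetic white and low-pass noise as stand-ins (nothing here comes from the paper's data) and a hypothetical helper, `spectral_mismatch`, that compares how energy is distributed across frequency bands. Both test signals are standardized to the same mean and variance, yet their spectra differ strongly.

```python
import numpy as np


def standardize(x):
    """Shift and scale to zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()


def band_energy(x, n_bands=16):
    """Fraction of spectral energy in each of n_bands equal-width bands."""
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    return energy / energy.sum()


def spectral_mismatch(a, b, n_bands=16):
    """Total-variation distance between band-energy profiles, in [0, 1]."""
    return 0.5 * np.abs(band_energy(a, n_bands) - band_energy(b, n_bands)).sum()


rng = np.random.default_rng(1)
white = standardize(rng.normal(size=4096))
control = standardize(rng.normal(size=4096))
smooth = standardize(
    np.convolve(rng.normal(size=4096), np.ones(16) / 16, mode="same")
)

same_kind = spectral_mismatch(white, control)  # two white-noise draws
different = spectral_mismatch(white, smooth)   # white vs. low-pass noise
```

A check of this kind, applied to the moment-matched horizontal noise and to the leakage in a synthetic case where the leakage is known, would indicate whether the surrogate shares more than mean and variance with the real thing.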

Referee Report

2 major / 2 minor

Summary. The paper introduces the HPAS framework for suppressing shear-wave leakage in the vertical (Z) component of OBC seismic data. It extracts S-wave noise from horizontal components, applies a linear transformation to match only the first- and second-order moments to the leakage observed in Z, and forms input-label pairs via additive and subtractive perturbation of the raw Z traces. These pairs are used to train a denoising model in a supervised-style manner without requiring clean P-wave labels. The method is evaluated on both synthetic and field data, with the central claim that it adaptively suppresses S-wave leakage while preserving P-wave amplitudes and exhibits strong generalization.

Significance. If the central claim holds, the work provides a practical route to applying deep-learning denoising to field OBC data where clean labels are unavailable. By constructing training pairs directly from raw multi-component recordings using an explicit external strategy, the approach avoids the circularity that would arise from fitting to the target itself and could improve imaging quality in surveys affected by receiver tilt and coupling issues. The data-driven construction with moment-matched horizontal priors is a clear strength relative to purely model-based or manually tuned methods.

major comments (2)

[Method (pair generation)] Method section on pair generation: the linear transformation is defined to match only first- and second-order moments of the extracted horizontal S-noise to the Z-component leakage. No demonstration is given that this suffices to reproduce the actual leakage operator (spectrum, spatial correlation, or phase relationship to the underlying P-wave). If higher-order statistics or frequency-dependent coupling remain unmatched, the model learns to suppress a surrogate rather than the true noise, undermining the claim that amplitude preservation of P-waves is a demonstrated property rather than an untested side-effect.
[Evaluation] Evaluation section: the abstract and results claim effective suppression on synthetic and field data, yet no quantitative metrics (SNR improvement, RMS error, amplitude fidelity measures), error bars, or statistical comparisons against baselines are reported. Without these, the central claim of robustness and generalization remains only partially supported, as noted in the low-confidence soundness assessment.

minor comments (2)

The abstract states that the matched noise is 'added to and subtracted from the original Z-component' but does not specify the exact scaling or whether the operation preserves the underlying P-wave energy exactly; a short clarifying equation would remove ambiguity.
Notation for the linear transformation coefficients is introduced without an explicit equation number; adding one would improve traceability when the method is referenced in later sections.
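For illustration, the clarifying equation requested in the first minor comment might take the following form. It assumes unit scaling of the matched noise and the convention that the sum is the input and the difference is the label; the symbols $\tilde{n}$, $z_{\mathrm{in}}$, and $z_{\mathrm{lab}}$ are introduced here and are not the paper's notation. Only $z = x + n$ is taken from the paper's Equation (1).

```latex
\begin{align}
  z &= x + n, \\
  z_{\mathrm{in}}  &= z + \tilde{n} = x + n + \tilde{n}, \\
  z_{\mathrm{lab}} &= z - \tilde{n} = x + n - \tilde{n}, \\
  z_{\mathrm{in}} - z_{\mathrm{lab}} &= 2\tilde{n}, \qquad
  \tfrac{1}{2}\left(z_{\mathrm{in}} + z_{\mathrm{lab}}\right) = z .
\end{align}
```

Under these assumptions the P-wave term $x$ enters the input and the label identically and cancels in their difference, so the pair differs only by the injected noise. Whether the trained network then preserves $x$ in practice is a separate, empirical question.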

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major point below and will revise the manuscript to strengthen the presentation of the method and evaluation.

Referee: Method section on pair generation: the linear transformation is defined to match only first- and second-order moments of the extracted horizontal S-noise to the Z-component leakage. No demonstration is given that this suffices to reproduce the actual leakage operator (spectrum, spatial correlation, or phase relationship to the underlying P-wave). If higher-order statistics or frequency-dependent coupling remain unmatched, the model learns to suppress a surrogate rather than the true noise, undermining the claim that amplitude preservation of P-waves is a demonstrated property rather than an untested side-effect.

Authors: We appreciate the referee pointing out the need for further justification of the moment-matching step. The linear transformation is intentionally limited to first- and second-order moments because the dominant leakage mechanism in OBC data is a scaled and offset version of horizontal S-wave energy arising from receiver tilt and coupling; matching mean and variance ensures the injected noise has comparable energy and offset to the observed leakage. The additive-subtractive pair construction then lets the network learn the actual noise waveform present in the difference between input and label. We acknowledge that higher-order statistics and frequency-dependent effects are not explicitly enforced. In the revised manuscript we will add a dedicated paragraph in the Method section explaining this design choice, together with supplementary figures comparing the power spectra and cross-correlations of the moment-matched noise versus the raw Z-component leakage on both synthetic and field examples. This addition will clarify the scope and limitations of the approach without changing the core algorithm. revision: partial
Referee: Evaluation section: the abstract and results claim effective suppression on synthetic and field data, yet no quantitative metrics (SNR improvement, RMS error, amplitude fidelity measures), error bars, or statistical comparisons against baselines are reported. Without these, the central claim of robustness and generalization remains only partially supported, as noted in the low-confidence soundness assessment.

Authors: We agree that quantitative metrics are necessary to substantiate the claims. The original manuscript emphasized visual comparisons and qualitative descriptions of P-wave preservation. In the revision we will add a new quantitative evaluation subsection that reports SNR improvement and RMS error on the synthetic test set, amplitude fidelity (correlation coefficient and relative amplitude error) on both synthetic and field data, and direct comparisons against conventional baselines (f-k filtering and curvelet thresholding). Error bars will be included for the synthetic experiments based on multiple noise realizations. These additions will provide the statistical support requested and allow a clearer assessment of generalization. revision: yes
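The metrics named in this response are simple to compute once a clean reference exists. The following sketch assumes a synthetic setting with a known clean trace; the function names are hypothetical, the "denoiser output" is a stand-in (clean trace plus a small random residual), and relative amplitude error is taken here to mean the relative difference in RMS amplitude, which may differ from the definition the authors adopt.

```python
import numpy as np


def rms_error(clean, estimate):
    """Root-mean-square error between estimate and clean reference."""
    return float(np.sqrt(np.mean((estimate - clean) ** 2)))


def correlation(clean, estimate):
    """Pearson correlation coefficient between the two traces."""
    return float(np.corrcoef(clean.ravel(), estimate.ravel())[0, 1])


def relative_amplitude_error(clean, estimate):
    """Relative difference in RMS amplitude (assumed definition)."""
    rms_clean = np.sqrt(np.mean(clean ** 2))
    rms_est = np.sqrt(np.mean(estimate ** 2))
    return float(abs(rms_est - rms_clean) / rms_clean)


def mean_and_std(values):
    """Mean and sample standard deviation, usable as an error bar."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))


t = np.linspace(0.0, 1.0, 500, endpoint=False)
clean = np.sin(2 * np.pi * 25 * t)

# Stand-in for denoiser outputs over several noise realizations.
rng = np.random.default_rng(2)
errors = []
for _ in range(20):
    output = clean + 0.05 * rng.normal(size=clean.shape)
    errors.append(rms_error(clean, output))

rmse_mean, rmse_std = mean_and_std(errors)
```

Reporting `rmse_mean` together with `rmse_std` over the realizations is one straightforward way to supply the error bars the referee asks for.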

Circularity Check

0 steps flagged

No circularity in pair-generation or training pipeline

The paper constructs training pairs externally from raw multi-component field data: S-wave noise is extracted from horizontal components, a linear transform matches only its first- and second-order moments to the (unknown) leakage in Z, and additive/subtractive perturbations create input/label pairs. The model then learns to suppress features present in the constructed differences. No equation or step reduces the final denoised Z output to a direct algebraic function of the original Z trace by construction; the learned mapping depends on the trained network weights. No self-citations, uniqueness theorems, or fitted parameters are invoked as load-bearing premises. Synthetic and field evaluations provide external checks independent of the pair-generation procedure.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that horizontal-component shear-wave signatures can be linearly transformed to statistically represent vertical-component leakage, plus the implicit choice of which moments to match.

free parameters (1)

linear transformation coefficients
Coefficients chosen to match first- and second-order moments of horizontal noise to vertical leakage; these are data-dependent and not derived from first principles.
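As a worked illustration of what these coefficients look like, assume the transformation is a scalar affine map $\tilde{n} = a\,h + b$ applied to the extracted horizontal noise $h$, with $\mu_h, \sigma_h$ its mean and standard deviation and $\mu_n, \sigma_n$ those of the Z-component leakage. This parameterization and all symbols other than $n$ are assumptions made here; the paper's transformation may be defined differently.

```latex
\begin{align}
  \mathbb{E}[\tilde{n}] &= a\,\mu_h + b = \mu_n, \\
  \operatorname{Var}[\tilde{n}] &= a^{2}\sigma_h^{2} = \sigma_n^{2}, \\
  \Rightarrow\quad a &= \pm\frac{\sigma_n}{\sigma_h}, \qquad
  b = \mu_n - a\,\mu_h .
\end{align}
```

Two observations follow under this assumption. The two moment conditions fix $a$ only up to sign, so polarity has to be set by some other criterion. And because $\mu_n$ and $\sigma_n$ describe leakage that is not directly observed, they must themselves be estimated from the data, which is why the ledger counts the coefficients as data-dependent.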

axioms (1)

domain assumption: Shear-wave leakage in the Z-component can be adequately represented by a linear transformation of noise extracted from the horizontal components
Invoked directly in the description of the additive-subtractive strategy.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

we extract shear-wave (S-wave) noise from the horizontal components and apply a linear transformation to match its first and second order moments with the S-wave leakage in the Z-component

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
