pith. machine review for the scientific record.

arxiv: 2605.11176 · v1 · submitted 2026-05-11 · 🌌 astro-ph.IM · cs.LG · hep-ex

Recognition: no theorem link

From raw data to neutrino candidates: a neural-network pipeline for Baikal-GVD

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 00:54 UTC · model grok-4.3

classification 🌌 astro-ph.IM · cs.LG · hep-ex

keywords Baikal-GVD · neutrino telescope · neural networks · transformer · data processing · domain adaptation · event selection · neutrino candidates

The pith

A neural network pipeline using transformers selects neutrino candidates from Baikal-GVD raw data much faster than conventional methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors describe a three-stage pipeline of transformer-based neural networks for processing data from the Baikal-GVD neutrino detector. The stages rapidly reject air shower events, suppress noise in optical modules, and identify high-confidence neutrino candidates. This sequential application provides a speedup of orders of magnitude compared to the standard reconstruction. The noise suppression network also outperforms traditional algorithms and estimates time residuals for signal hits. Domain adaptation allows the networks trained on simulations to work on real experimental data, enabling near-real-time classification useful for alerts and flux studies.
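The sequential design is what buys the speedup: cheap early stages reject most events before the expensive ones run. A minimal stand-in sketch of such a cascade, with stage logic and event fields invented purely for illustration:

```python
def run_pipeline(events, stages):
    """Sequential cascade: each stage sees only the previous stage's
    survivors, so cheap early rejection dominates the total cost."""
    counts = [len(events)]
    for keep in stages:
        events = [e for e in events if keep(e)]
        counts.append(len(events))
    return events, counts

# Toy events: dicts with flags standing in for network decisions.
events = [{"id": i, "eas": i % 2 == 0, "signal": i % 3 == 0} for i in range(12)]
stages = [
    lambda e: not e["eas"],   # 1) fast air-shower suppression
    lambda e: e["signal"],    # 2) noise suppression (stand-in)
    lambda e: e["id"] > 2,    # 3) high-confidence candidate cut (stand-in)
]
survivors, counts = run_pipeline(events, stages)
print(counts)  # → [12, 6, 2, 2]: event count after each stage
```

The count after each stage shrinks, which is why the later, more expensive networks contribute little to total runtime.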

Core claim

A pipeline of three transformer networks, exploiting inter-hit correlations through attention, suppresses extensive air shower events, rejects noisy optical-module activations, and extracts high-confidence neutrino candidates. Applied sequentially, it achieves an orders-of-magnitude speedup over the standard reconstruction chain. The noise suppression network surpasses the accuracy of algorithmic approaches and provides estimates of the time residuals of signal hits, which aid track-like hit identification. Domain adaptation bridges Monte Carlo simulations and experimental data, improving agreement between the domains.

What carries the argument

A sequence of three transformer networks that use attention mechanisms to exploit correlations between detector hits, combined with domain adaptation to transfer models from simulations to real data.
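The attention operation those networks rely on can be sketched in a few lines of numpy. This is the generic scaled dot-product mechanism, not the paper's architecture; the per-hit feature layout (time, position, charge) and the dimensions are assumptions for illustration:

```python
import numpy as np

def self_attention(X, d_k, seed=0):
    """Single-head scaled dot-product self-attention over detector hits.

    X: (n_hits, d) array of per-hit features, e.g. (time, x, y, z, charge).
    Returns one attended embedding per hit plus the attention weights,
    whose rows show how strongly each hit attends to every other hit.
    """
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((X.shape[1], d_k)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d_k)        # (n_hits, n_hits) hit-hit affinities
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)      # softmax over hits
    return w @ V, w                        # each hit mixes info from all others

# Three toy hits: (time [ns], x, y, z [m], charge [p.e.])
hits = np.array([[0.0, 1.0, 0.0, -2.0, 0.8],
                 [5.0, 1.1, 0.1, -1.0, 1.2],
                 [9.0, 0.9, 0.2,  0.0, 0.5]])
emb, attn = self_attention(hits, d_k=4)
print(emb.shape, attn.shape)  # → (3, 4) (3, 3)
```

Because every hit attends to every other hit, the network can pick up causally connected hit patterns (e.g. a Cherenkov front sweeping across modules) without a fixed geometric ordering.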

If this is right

  • Sequential application of the three networks yields orders-of-magnitude speedup in event processing.
  • The noise suppression network provides time residual estimates that are crucial for identification of track-like hits.
  • Near-real-time event classification becomes feasible for multi-messenger alert systems.
  • Improved accuracy supports measurements of the diffuse neutrino flux.
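The time-residual point can be made concrete. The abstract does not define the residual; a common convention for track-like events is the hit time minus a geometric expectation for a straight muon track, with illustrative water-optics constants assumed here:

```python
import math

C_VAC = 0.2998                      # m/ns, speed of light in vacuum
N_PHASE, N_GROUP = 1.33, 1.38       # illustrative refractive indices of lake water
THETA_C = math.acos(1.0 / N_PHASE)  # Cherenkov angle

def time_residual(t_hit, t0, l, d):
    """Hit time minus the geometric expectation for a straight muon track.

    t0: time the muon passes the track's reference point (ns)
    l:  distance along the track from that point to the module's
        closest-approach projection (m)
    d:  perpendicular track-to-module distance (m)
    """
    t_exp = (t0
             + (l - d / math.tan(THETA_C)) / C_VAC         # muon flight
             + d / (math.sin(THETA_C) * C_VAC / N_GROUP))  # photon flight
    return t_hit - t_exp

print(time_residual(150.0, 0.0, 40.0, 20.0))  # early/late relative to geometry
```

Signal hits cluster near zero residual under the correct track hypothesis, while noise hits scatter broadly, which is why a residual estimate helps tag track-like hits.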

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The pipeline approach could be retrained for other large water Cherenkov neutrino detectors with different geometries.
  • Faster processing could enable online triggering and integration with external observatories for prompt multi-messenger follow-up.
  • Time residual estimates from the noise network might feed into improved full event reconstruction beyond candidate selection.

Load-bearing premise

The domain adaptation technique successfully bridges the distribution shift between Monte Carlo simulations and experimental data.
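The abstract does not name the adaptation method, but the reference graph includes Ganin et al.'s domain-adversarial training [7], so a gradient-reversal layer is one plausible reading. A scalar toy showing the sign flip that drives features toward domain confusion (all numbers invented):

```python
import math

def grad_reverse(x, lam):
    """Gradient reversal: identity on the forward pass, gradient scaled by
    -lam on the backward pass (modeled here as an explicit backward rule)."""
    return x, (lambda g: -lam * g)

# Toy setup: one scalar feature, a logistic domain classifier (1 = simulation).
w, b = 0.5, 0.0
feat, y_dom, lam = 1.2, 1.0, 0.3

f, backward = grad_reverse(feat, lam)
p = 1.0 / (1.0 + math.exp(-(w * f + b)))  # P(domain = simulation)
g_f = (p - y_dom) * w                     # dL/df, logistic cross-entropy
g_ext = backward(g_f)                     # gradient reaching the feature extractor

# The sign flip pushes the extractor toward domain-indistinguishable features.
print(g_f < 0, g_ext > 0)  # → True True
```

The domain classifier learns to tell simulation from data, while the reversed gradient trains the upstream feature extractor to defeat it, so the features the downstream classifiers see become domain-invariant.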

What would settle it

Running the full pipeline on a sample of real Baikal-GVD events and comparing the selected neutrino candidates and their time residual estimates directly against the output of the standard algorithmic reconstruction chain on the same events.
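Such a head-to-head comparison reduces, at minimum, to set arithmetic over the selected event IDs. A hypothetical helper (names and ID lists invented) for tallying agreement between the two chains:

```python
def selection_agreement(nn_ids, std_ids):
    """Compare candidate event IDs from the NN pipeline and the standard
    chain on the same event sample."""
    nn, std = set(nn_ids), set(std_ids)
    both, either = nn & std, nn | std
    return {
        "both": len(both),          # selected by both chains
        "nn_only": len(nn - std),   # extra NN candidates to inspect
        "std_only": len(std - nn),  # candidates the NN pipeline misses
        "jaccard": len(both) / len(either) if either else 1.0,
    }

print(selection_agreement([101, 102, 103, 105], [102, 103, 104]))
# → {'both': 2, 'nn_only': 2, 'std_only': 1, 'jaccard': 0.4}
```

The disagreement sets are the interesting part: `nn_only` and `std_only` events are exactly where the time-residual estimates should be compared hit by hit.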

read the original abstract

We present a neural-network-based data processing pipeline for Baikal-GVD, designed to improve event reconstruction quality and accelerate neutrino candidates selection. The pipeline comprises three stages: fast suppression of extensive air shower events, suppression of noise optical modules activations, and extraction of high confidence neutrino candidates. All three networks employ a transformer architecture that exploits inter-hit correlations through the attention mechanism. Applied sequentially, the pipeline achieves orders-of-magnitude speedup over the standard reconstruction chain. Moreover, noise suppression neural network surpasses the accuracy of algorithmic noise suppression algorithms and provides estimate for time residuals of the signal hits, which is crucial for identification of track-like hits. We address the domain shift between Monte Carlo simulations and experimental data by incorporating a domain adaptation technique, demonstrating improved agreement between the two domains. The resulting framework enables near-real-time event classification, with direct applications to multi-messenger alert systems and diffuse neutrino flux measurements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a three-stage transformer-based neural network pipeline for Baikal-GVD raw data processing: (1) fast suppression of extensive air shower events, (2) noise suppression in optical modules with time-residual estimation for track-like hits, and (3) extraction of high-confidence neutrino candidates. All stages exploit inter-hit correlations via attention; the pipeline incorporates domain adaptation to address Monte Carlo vs. experimental data shift and is claimed to deliver orders-of-magnitude speedup over the standard reconstruction chain while improving accuracy.

Significance. If the reported speedups and accuracy gains are quantitatively validated, the work would enable near-real-time neutrino candidate selection, directly supporting multi-messenger alert systems and diffuse flux analyses with large underwater Cherenkov arrays. The attention-based modeling of hit correlations is a timely application of modern sequence models to this instrumentation domain.

major comments (2)
  1. [Abstract] The central performance claims (orders-of-magnitude speedup, noise-suppression accuracy surpassing algorithmic baselines, and provision of time-residual estimates) are stated without numerical values, tables, figures, error bars, or ablation results. These metrics are load-bearing for the pipeline's claimed utility.
  2. [Abstract] The statement that domain adaptation yields 'improved agreement between the two domains' specifies neither the adaptation method (adversarial, MMD, etc.), nor a quantitative shift metric, nor performance deltas on held-out real Baikal-GVD data versus MC. This directly affects how the speedup and accuracy claims transfer to experimental events.
minor comments (2)
  1. [Methods] The input feature representation fed to the transformer (hit coordinates, charges, times) should be explicitly listed or referenced to a table for reproducibility.
  2. [Noise suppression stage] Clarify whether the reported time-residual estimates are per-hit or per-event and how they are used downstream for track identification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of our work's potential significance for real-time neutrino selection and for the detailed comments on the abstract. We address each major comment below and will incorporate revisions to improve clarity and self-containment of the performance claims.

read point-by-point responses
  1. Referee: [Abstract] The central performance claims (orders-of-magnitude speedup, noise-suppression accuracy surpassing algorithmic baselines, and provision of time-residual estimates) are stated without numerical values, tables, figures, error bars, or ablation results. These metrics are load-bearing for the pipeline's claimed utility.

    Authors: We agree that the abstract would be strengthened by including key quantitative metrics. The main body of the manuscript already provides these details, including specific speedup factors, accuracy comparisons against algorithmic baselines with error bars, ablation studies, and figures illustrating the time-residual estimates. In the revised version we will update the abstract to incorporate representative numerical values (with appropriate references to the supporting sections, tables, and figures) while preserving its concise nature. revision: yes

  2. Referee: [Abstract] The statement that domain adaptation yields 'improved agreement between the two domains' specifies neither the adaptation method (adversarial, MMD, etc.), nor a quantitative shift metric, nor performance deltas on held-out real Baikal-GVD data versus MC. This directly affects how the speedup and accuracy claims transfer to experimental events.

    Authors: The domain-adaptation procedure is described in the methods and results sections of the manuscript, including the specific technique and supporting analyses. To address the comment we will revise the abstract to name the adaptation method explicitly, report a quantitative domain-shift metric, and include performance deltas measured on held-out real Baikal-GVD data versus Monte Carlo. These additions will make the transferability of the reported speedups and accuracy gains clearer without altering the underlying results. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical ML pipeline with external validation

full rationale

The paper presents an applied neural-network pipeline (transformer-based stages for EAS suppression, noise rejection, and candidate selection) trained on Monte Carlo and adapted to real Baikal-GVD data. No derivation chain exists that reduces predictions or results to inputs by construction: performance claims rest on empirical comparisons to algorithmic baselines and reported domain-agreement improvements, not on fitted parameters renamed as predictions or self-referential definitions. Domain adaptation is invoked as a practical bridge rather than a uniqueness theorem or ansatz smuggled via self-citation. The central claims are falsifiable against held-out data and standard reconstruction chains, satisfying the criteria for a self-contained empirical result with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard machine-learning assumptions about transformer capacity to capture hit correlations and the effectiveness of domain adaptation for simulation-to-data transfer. No free parameters, new physical entities, or ad-hoc axioms are mentioned in the abstract.

axioms (2)
  • domain assumption Transformer attention mechanisms can effectively capture inter-hit correlations in optical module activation data.
    Invoked when describing the network architecture for all three stages.
  • domain assumption Domain adaptation techniques can sufficiently align Monte Carlo simulation distributions with real experimental data distributions.
    Explicitly used to address the domain shift between simulations and data.

pith-pipeline@v0.9.0 · 5497 in / 1368 out tokens · 47757 ms · 2026-05-13T00:54:57.479766+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. [1] T. K. Gaisser, F. Halzen and T. Stanev, Particle astrophysics with high energy neutrinos, Phys. Rept. 258 (1995) 173–236

  2. [2] I. Bartos and M. Kowalski, Multi-messenger astrophysics, Nature Rev. Phys. 2 (2020) 446–458

  3. [3] I. Tamm and I. M. Frank, Coherent visible radiation of fast electrons passing through matter, C. R. Acad. Sci. URSS 14 (1937) 109–114

  4. [4] F. Halzen and S. R. Klein, IceCube: An instrument for neutrino astronomy, Rev. Sci. Instrum. 81 (2010) 081101

  5. [5] D. Heck, J. Knapp, J. N. Capdevielle, G. Schatz and T. Thouw, CORSIKA: A Monte Carlo code to simulate extensive air showers, Report FZKA 6019, Forschungszentrum Karlsruhe, 1998

  6. [6] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez et al., Attention is all you need, in Advances in Neural Information Processing Systems, vol. 30, 2017, https://papers.nips.cc/paper/7181-attention-is-all-you-need

  7. [7] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette et al., Domain-adversarial training of neural networks, J. Mach. Learn. Res. 17 (2016) 1–35

  8. [8] I. Kharuk et al., Application of machine learning methods in Baikal-GVD: Background noise rejection and selection of neutrino-induced events, Moscow Univ. Phys. Bull. 79 (2024) 97–103

  9. [9] S. Ostapchenko, QGSJET-II: towards reliable description of very high energy hadronic interactions, Nucl. Phys. B Proc. Suppl. 151 (2006) 143–146

  10. [10] J. Devlin, M.-W. Chang, K. Lee and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 4171–4186, 2019, https://aclanthology.org/N19-1423

  11. [11] U. Alon and E. Yahav, On the bottleneck of graph neural networks and its practical implications, in International Conference on Learning Representations (ICLR), 2021, https://arxiv.org/abs/2006.05205

  12. [12] T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, Focal loss for dense object detection, in Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988, 2017

  13. [13] V. Allakhverdyan, A. Avrorin, A. Avrorin, V. Aynutdinov, R. Bannasch, Z. Bardačová et al., An efficient hit finding algorithm for Baikal-GVD muon reconstruction, arXiv:2108.00208 (2021)

  14. [14] L. McInnes, J. Healy and J. Melville, UMAP: Uniform manifold approximation and projection for dimension reduction, 2020

  15. [15] J. Haser, F. Kaether, C. Langbrandtner, M. Lindner, S. Lucht, S. Roth et al., Afterpulse measurements of R7081 photomultipliers for the Double Chooz experiment, JINST 8 (2013) P04029, [1301.2508]