pith. machine review for the scientific record.

arxiv: 2604.13687 · v1 · submitted 2026-04-15 · 🌀 gr-qc · astro-ph.IM · cs.LG

Recognition: unknown

VIGILant: an automatic classification pipeline for glitches in the Virgo detector

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 12:38 UTC · model grok-4.3

classification 🌀 gr-qc · astro-ph.IM · cs.LG
keywords gravitational wave detectors · glitch classification · machine learning · ResNet · Virgo · spectrogram analysis · data quality · real-time monitoring

The pith

VIGILant automatically classifies glitches in Virgo gravitational-wave data using a ResNet34 model that reaches 0.9833 accuracy on test data and has been running in daily operation since O4c.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents VIGILant as a ready-to-use pipeline that ingests Virgo detector data, converts it to spectrogram images or structured Omicron parameters, and assigns each glitch to one of several known classes. Tree-based models are compared against a convolutional ResNet34, with the latter delivering an F1 score of 0.9772 and inference times of tens of milliseconds. Once trained on the O3b curated set, the system was deployed at the Virgo site to supply an interactive dashboard that flags low-confidence predictions for human review. This matters because glitches are a leading source of contamination in searches for astrophysical gravitational-wave signals; reliable, fast classification helps analysts decide which data segments require further cleaning or exclusion.
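As a concrete picture of that routing, here is a minimal sketch of the two-branch flow; the Glitch container and classify_glitch helper are illustrative names, and the scikit-learn-style predict_proba interface is an assumption, not the VIGILant API.

```python
# Minimal sketch of the two-branch flow described above; the names and the
# predict_proba-style interface are illustrative, not the VIGILant codebase.
import numpy as np
from dataclasses import dataclass

@dataclass
class Glitch:
    gps_time: float          # central GPS time of the Omicron trigger
    omicron_params: dict     # e.g. {"peakFreq": ..., "SNR": ..., "amplitude": ...}
    spectrogram: np.ndarray  # Q-transform image of the glitch

def classify_glitch(glitch: Glitch, tree_model=None, cnn_model=None):
    """Route one glitch through either model family; return (label, score)."""
    if cnn_model is not None:
        # Image branch: ResNet34 scores on the spectrogram
        scores = cnn_model.predict_proba([glitch.spectrogram])[0]
    else:
        # Tabular branch: tree ensemble on hand-crafted Omicron parameters
        features = [[glitch.omicron_params[k] for k in ("peakFreq", "SNR", "amplitude")]]
        scores = tree_model.predict_proba(features)[0]
    label = int(scores.argmax())
    return label, float(scores[label])
```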

Core claim

The ResNet34 model trained on glitch spectrograms outperforms tree-based classifiers that use hand-crafted Omicron parameters, attaining 0.9772 F1 and 0.9833 accuracy on a held-out test set drawn from Virgo O3b data; the resulting VIGILant pipeline has been placed into daily production at the Virgo observatory since the start of observing run O4c, supplying real-time class labels and a dashboard that surfaces uncertain cases.
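For reference, the headline numbers are standard held-out-set metrics. The abstract does not say which F1 averaging the authors use, so the macro average in this sketch is an assumption:

```python
# Computing the headline metrics from held-out predictions. The "macro"
# averaging is an assumption; the abstract does not specify it.
from sklearn.metrics import accuracy_score, f1_score

def headline_metrics(y_true, y_pred):
    return {
        "accuracy": accuracy_score(y_true, y_pred),       # paper reports 0.9833 for ResNet34
        "f1": f1_score(y_true, y_pred, average="macro"),  # paper reports 0.9772
    }
```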

What carries the argument

The VIGILant pipeline, which routes either structured Omicron parameters through tree ensembles or spectrogram images through a ResNet34 convolutional network to produce glitch class labels and confidence scores.
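A minimal sketch of the image branch, assuming a standard torchvision ResNet34 with its classification head resized to the glitch classes; the class count and 3-channel input layout are placeholders, not values from the paper.

```python
# Minimal ResNet34 glitch classifier sketch (torchvision). NUM_CLASSES and
# the 3-channel input layout are placeholders, not values from the paper.
import torch
import torch.nn as nn
from torchvision.models import resnet34

NUM_CLASSES = 10  # placeholder: one output per glitch class in the curated set

model = resnet34(weights=None)                            # train from scratch, or load weights
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)   # resize the classification head

def predict(spectrograms: torch.Tensor) -> torch.Tensor:
    """spectrograms: (N, 3, H, W) Q-transform images; returns per-class scores."""
    model.eval()
    with torch.no_grad():
        return torch.softmax(model(spectrograms), dim=1)
```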

If this is right

  • Glitch populations can be monitored in near real time, allowing operators to correlate specific classes with detector subsystems.
  • Low-confidence glitches are automatically flagged for expert inspection rather than being silently misclassified (a minimal flagging sketch follows this list).
  • The model can be retrained periodically on newer labeled data to track evolving glitch morphologies.
  • Downstream gravitational-wave searches receive cleaner data segments once problematic glitches are identified and excised.
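A minimal sketch of the flagging step referenced above, using the maximum softmax probability as the confidence score; the 0.9 cutoff is an assumed value, not one taken from the paper.

```python
# Flagging low-confidence predictions via maximum softmax probability.
# The 0.9 cutoff is an assumption, not a value from the paper.
import torch

CONF_THRESHOLD = 0.9

def flag_uncertain(probs: torch.Tensor):
    """probs: (N, num_classes) softmax outputs for a batch of glitches."""
    confidence, label = probs.max(dim=1)
    needs_review = confidence < CONF_THRESHOLD  # route these rows to human review
    return label, confidence, needs_review
```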

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be ported to LIGO or KAGRA by swapping the input data streams and retraining on their respective glitch catalogs.
  • If class definitions are kept stable, the pipeline provides a quantitative baseline for measuring improvements in detector commissioning over successive runs.
  • Integration with existing Omicron trigger generators would allow end-to-end automated flagging from raw strain data to classified output.

Load-bearing premise

The glitch types and statistical properties present in the O3b training data will continue to match those encountered in O4c without large distribution shifts or changes in how glitches are defined.

What would settle it

A sustained drop in classification accuracy below 0.95 or a sharp rise in low-confidence predictions when VIGILant processes a fresh month of O4c data would indicate that the model no longer generalizes.
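One way to operationalize that test, assuming monthly batches of max-softmax confidences plus an optional hand-labeled spot check; the low-confidence cutoff, baseline fraction, and "sharp rise" factor are assumptions chosen for illustration.

```python
# Sketch of the generalization alarm described above. The 0.9 cutoff, the
# 5% baseline low-confidence fraction, and the 2x rise factor are assumptions.
def generalization_alarm(confidences, spot_check_accuracy=None,
                         accuracy_floor=0.95, low_conf_cutoff=0.9,
                         baseline_low_conf_fraction=0.05):
    """confidences: max-softmax values for a fresh month of O4c glitches.
    spot_check_accuracy: accuracy on a small hand-labeled O4c sample, if any."""
    low_conf_fraction = sum(c < low_conf_cutoff for c in confidences) / len(confidences)
    drifted = low_conf_fraction > 2 * baseline_low_conf_fraction   # "sharp rise"
    if spot_check_accuracy is not None:
        drifted = drifted or spot_check_accuracy < accuracy_floor  # "sustained drop"
    return drifted, low_conf_fraction
```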

Figures

Figures reproduced from arXiv:2604.13687 by Alejandro Torres-Forné, Antonio Onofre, Francesco Di Renzo, José A. Font, Tiago Fernandes.

Figure 1: Distribution of the Gravity Spy labels for Virgo …
Figure 2: UMAP projection of the curated glitch dataset, before and after the re-labeling of some glitches. The re…
Figure 3: Distribution of the labels for our Virgo O3b glitch …
Figure 4: Examples of the different glitch classes in our …
Figure 6: Confusion matrices for the validation dataset obtained with the best-performing configuration of each model …
Figure 7: Feature importance of the …
Figure 8: Glitches from the validation dataset where the …
Figure 9: Confusion matrix obtained for the best ResNet …
Figure 10: Distribution of the maximum softmax prob…
Figure 11: Confusion matrix obtained for the best ResNet …
Figure 12: Examples of glitches from two out-of-domain …
Figure 13: Distribution of the maximum softmax probabil…
Figure 14: Confusion matrix for the out-of-domain …
Figure 15: Example plots from the glitch dashboard for March 22, 2020. The top left panel shows the glitchgram, with …
Original abstract

Glitches frequently contaminate data in gravitational-wave detectors, complicating the observation and analysis of astrophysical signals. This work introduces VIGILant, an automatic pipeline for classification and visualization of glitches in the Virgo detector. Using a curated dataset of Virgo O3b glitches, two machine learning approaches are evaluated: tree-based models (Decision Tree, Random Forest and XGBoost) using structured Omicron parameters, and Convolutional Neural Networks (ResNet) trained on spectrogram images. While tree-based models offer higher interpretability and fast training, the ResNet34 model achieved superior performance, reaching a F1 score of 0.9772 and accuracy of 0.9833 in the testing set, with inference times of tens of milliseconds per glitch. The pipeline has been deployed for daily operation at the Virgo site since observing run O4c, providing the Virgo collaboration with an interactive dashboard to monitor glitch populations and detector behavior. This allows to identify low-confidence predictions, highlighting glitches requiring further attention.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces VIGILant, an automatic pipeline for classifying and visualizing glitches in the Virgo gravitational-wave detector. It evaluates tree-based models (Decision Tree, Random Forest, XGBoost) on structured Omicron parameters against a ResNet CNN trained on spectrogram images, using a curated dataset of O3b glitches. The ResNet34 model is reported to achieve the best performance with F1 score 0.9772 and accuracy 0.9833 on the test set and inference times of tens of milliseconds; the pipeline has been deployed for daily operation at the Virgo site since O4c, including an interactive dashboard for monitoring glitch populations and low-confidence predictions.

Significance. If the reported performance generalizes, this provides a practical, deployable tool for real-time glitch monitoring that can improve data quality assessment and detector characterization during observing runs. The combination of high-accuracy CNN classification with an operational dashboard and fast inference is a concrete strength for the Virgo collaboration. Explicit credit is due for the reported deployment in O4c operations, which demonstrates end-to-end utility beyond offline evaluation.

major comments (2)
  1. [Abstract] Abstract: The headline F1 score of 0.9772 and accuracy of 0.9833 are presented without any information on dataset size, class balance, train-test split method, cross-validation procedure, or uncertainty estimates on the metrics. This omission is load-bearing for the central claim of superior and reliable classification performance, as it prevents assessment of overfitting or statistical robustness.
  2. [Abstract] Deployment statement (Abstract and concluding sections): The manuscript asserts daily operational use since O4c without supplying any O4c-specific accuracy, confusion matrix, or distribution-shift diagnostics. Because glitch morphologies and class frequencies are known to evolve between runs due to detector changes, the absence of quantitative transfer validation leaves the real-time reliability claim unsupported by evidence.
minor comments (1)
  1. [Abstract] Abstract: The phrasing 'This allows to identify low-confidence predictions' is grammatically awkward and should be revised for clarity (e.g., 'This allows identification of...' or 'This enables the identification of...').

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and constructive comments on our manuscript. We address each major comment point by point below, proposing revisions to enhance clarity and support for the claims where appropriate.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The headline F1 score of 0.9772 and accuracy of 0.9833 are presented without any information on dataset size, class balance, train-test split method, cross-validation procedure, or uncertainty estimates on the metrics. This omission is load-bearing for the central claim of superior and reliable classification performance, as it prevents assessment of overfitting or statistical robustness.

    Authors: We agree that the abstract would be strengthened by including these details to allow readers to better evaluate the performance claims. In the revised manuscript, we will update the abstract to report the size of the curated O3b glitch dataset, the class balance across glitch types, the train-test split method (including any stratification), the cross-validation procedure if used, and uncertainty estimates on the metrics where computable from the evaluation. This will make the headline results more interpretable without requiring immediate reference to the main text. revision: yes

  2. Referee: [Abstract] Deployment statement (Abstract and concluding sections): The manuscript asserts daily operational use since O4c without supplying any O4c-specific accuracy, confusion matrix, or distribution-shift diagnostics. Because glitch morphologies and class frequencies are known to evolve between runs due to detector changes, the absence of quantitative transfer validation leaves the real-time reliability claim unsupported by evidence.

    Authors: The deployment statement describes the factual integration of the pipeline into Virgo's daily operations since O4c, including the interactive dashboard for monitoring glitch populations and low-confidence predictions. The reported F1 score and accuracy are explicitly based on the O3b test set, as detailed in the manuscript body. We acknowledge that no O4c-specific quantitative metrics or distribution-shift analysis are provided, which limits claims of generalization across runs. We will revise the abstract and concluding sections to explicitly distinguish the O3b validation results from the operational deployment, noting that the latter enables ongoing qualitative monitoring by the collaboration and can support future retraining. This addresses the concern by clarifying the scope of the claims. revision: partial
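A minimal sketch of the evaluation details the rebuttal promises in response 1: a class-stratified train/test split and bootstrap uncertainty on the metrics. The split fraction, resample count, and macro averaging are assumptions, not the authors' stated procedure.

```python
# Stratified split plus bootstrap confidence interval on the test-set F1.
# The 80/20 split, 1000 resamples, and macro averaging are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

def stratified_split(X, y, test_size=0.2, seed=0):
    """Hold out a test set while preserving the glitch-class balance."""
    return train_test_split(X, y, test_size=test_size, stratify=y, random_state=seed)

def bootstrap_f1_ci(y_true, y_pred, n_boot=1000, seed=0):
    """95% bootstrap confidence interval on the macro-F1 of fixed predictions."""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample with replacement
        scores.append(f1_score(y_true[idx], y_pred[idx], average="macro"))
    return np.percentile(scores, [2.5, 97.5])
```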

Circularity Check

0 steps flagged

No circularity: standard empirical ML evaluation on held-out test set

full rationale

The paper trains and evaluates tree-based models and ResNet34 on a curated O3b glitch dataset, reporting F1 and accuracy on a held-out testing set. These metrics are computed directly from model predictions versus ground-truth labels in the test split; no equations, first-principles derivations, or quantities are defined in terms of the fitted parameters themselves. No self-citations, uniqueness theorems, or ansatzes appear in any load-bearing step. The O4c deployment statement is an operational claim without quantitative performance assertions that would require circular validation. The analysis is therefore self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Performance claims rest on the representativeness of the O3b curated dataset for future operations and on the assumption that the chosen features and image representations capture class-discriminating information.

free parameters (1)
  • ResNet34 and tree-model hyperparameters
    Standard deep-learning and ensemble training choices (learning rate, depth, regularization, etc.) that are not specified in the abstract
axioms (2)
  • domain assumption The curated O3b glitch dataset is representative of glitches in O4c
    Training and test performance on this dataset is used to justify daily operational deployment
  • domain assumption Omicron parameters and spectrogram images contain sufficient information to distinguish glitch classes
    Basis for feeding these representations into the two families of models
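The paper's reference list includes Optuna [45] for hyperparameter optimization; a minimal sketch of such a search is below. The search space and the train_and_validate stub are assumptions, not the authors' configuration.

```python
# Minimal Optuna hyperparameter search sketch. The search space and the
# train_and_validate stub are assumptions, not the paper's configuration.
import optuna

def train_and_validate(**hparams) -> float:
    """Stand-in for one full training run; replace with code that trains a
    model using `hparams` and returns its validation F1."""
    raise NotImplementedError

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)              # CNN branch
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True)
    max_depth = trial.suggest_int("max_depth", 3, 12)                 # tree branch
    return train_and_validate(lr=lr, weight_decay=weight_decay, max_depth=max_depth)

study = optuna.create_study(direction="maximize")
# study.optimize(objective, n_trials=50)  # run once train_and_validate is real
```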

pith-pipeline@v0.9.0 · 5492 in / 1679 out tokens · 49005 ms · 2026-05-10T12:38:55.206881+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

57 extracted references · 33 canonical work pages · 8 internal anchors

  1. [1]

    Percentages (shown for the training set) indicate the fraction of each subset represented by the given class

    Omicron parameters. For each glitch, identified by its central GPS time, a few parameters that are useful for its characterization are computed by the Omicron algorithm: • peakFreq: the peak frequency, that is, the frequency of the tile at which the SNR is maximized; • SNR: the signal-to-noise ratio of the loudest tile forming this glitch; • amplitude: the am...

  2. [2]

    encoded" views introduced in [26]. This encoded view, using the 0.5, 2.0 and 4.0 second windows, corresponds to the "encoded134

    Spectrograms. Aside from their Omicron parameters, glitches can be represented as spectrograms, using the Q-transform method [23]. In this representation, a high-resolution spectrogram shows how the glitch frequency (y-axis) and amplitude (colour scale) evolve over time (x-axis). An example of each glitch class using this representation is shown in Figure 4. ...

  3. [3]

    These models were selected for their established performance on structured data and their interpretability

    Tree-based models. Three tree-based ML algorithms were employed to classify glitches using the Omicron features: Decision Trees (DTs), Random Forests (RFs) and Extreme Gradient Boosting (XGBoost). These models were selected for their established performance on structured data and their interpretability. The simplest algorithm, the Decision Tree [27], is tr...

  4. [4]

    "sqrt", "log2"

    Convolutional Neural Networks. For the image classification we employ models from the ResNet architecture family [32]. ResNets are convolutional neural networks (CNNs) [33] which extract hierarchical features from images: early convolutional layers detect low-level patterns such as edges, while deeper layers capture higher-level, more abstract featur...

  5. [5]

    fetch the Omicron unclustered triggers from the previous day

  6. [6]

    cluster the triggers

  7. [7]

    generate spectrograms

  8. [8]

    use the trained ResNet to get the model predictions

  9. [9]

    real-world

    update the glitch dashboard. Initially, VIGILant retrieves all Omicron unclustered triggers from the previous day, with a 10-second margin to avoid splitting glitches that occur around the change of the calendar day. Furthermore, only triggers produced while the Virgo interferometer was in observing mode, i.e. officially collecting data for astrophysical anal...

  10. [10]

    B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Physical Review Letters 116, 061102 (2016), arXiv:1602.03837

  11. [11]

    J. Aasi et al. (LIGO Scientific Collaboration), Classical and Quantum Gravity 32, 074001 (2015), arXiv:1411.4547

  12. [12]

    F. Acernese et al. (Virgo Collaboration), Classical and Quantum Gravity 32, 024001 (2015), arXiv:1408.3978

  13. [13]

    T. Akutsu et al., Progress of Theoretical and Experimental Physics 2021, 05A101 (2021), arXiv:2005.05574

  14. [14]

    A. G. Abac et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration), The Astrophysical Journal Letters 995, L18 (2025), arXiv:2508.18080

  15. [15]

    R. Abbott et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration), GWTC-4.0: Tests of General Relativity. I. Overview and General Tests (2026), arXiv:2603.19019

  16. [16]

    B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Physical Review Letters 121, 161101 (2018)

  17. [17]

    A. G. Abac et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration) (2025), arXiv:2508.18083

  18. [18]

    A. G. Abac et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration) (2025), arXiv:2509.04348

  19. [19]

    B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Classical and Quantum Gravity 37, 055002 (2020), arXiv:1908.11170

  20. [20]

    D. Davis et al., Classical and Quantum Gravity 38, 135014 (2021), arXiv:2101.11673

  21. [21]

    F. Di Renzo, "Data quality overview after re-engaging Science mode," https://logbook.virgo-gw.eu/virgo/?r=64853 (2024)

  22. [22]

    D. Davis, T. B. Littenberg, I. M. Romero-Shaw, M. Millhouse, J. McIver, F. Di Renzo, and G. Ashton, Classical and Quantum Gravity 39, 245013 (2022), arXiv:2207.03429

  23. [23]

    E. Cuoco, M. Cavaglià, I. S. Heng, D. Keitel, and C. Messenger, Living Reviews in Relativity 28, 2 (2025), arXiv:2412.15046

  24. [24]

    F. Robinet, N. Arnaud, N. Leroy, A. Lundgren, D. Macleod, and J. McIver, SoftwareX 12, 100620 (2020), arXiv:2007.11374

  25. [25]

    M. Zevin et al., Classical and Quantum Gravity 34, 064003 (2017), arXiv:1611.04596

  26. [26]

    M. Zevin et al., The European Physical Journal Plus 139, 100 (2024), arXiv:2308.15530

  27. [27]

    Y. Wu, M. Zevin, C. P. L. Berry, K. Crowston, C. Østerlund, Z. Doctor, S. Banagiri, C. B. Jackson, V. Kalogera, and A. K. Katsaggelos, Classical and Quantum Gravity 42, 165015 (2025), arXiv:2401.12913

  28. [28]

    M. Razzano, F. Di Renzo, F. Fidecaro, G. Hemming, and S. Katsanevas, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 1048, 167959 (2023), arXiv:2301.05112

  29. [29]

    L. McInnes, J. Healy, and J. Melville, "UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction" (2018), arXiv:1802.03426

  30. [30]

    I. T. Jolliffe, Principal Component Analysis, Springer Series in Statistics (Springer-Verlag, 2002)

  31. [31]

    F. Robinet, Omicron: an algorithm to detect and characterize transient events in gravitational-wave detectors, Tech. Rep. (Virgo, 2018)

  32. [32]

    S. Chatterji, L. Blackburn, G. Martin, and E. Katsavounidis, Classical and Quantum Gravity 21, S1809 (2004), arXiv:gr-qc/0412119

  33. [33]

    D. M. Macleod, J. S. Areeda, S. B. Coughlin, T. J. Massinger, and A. L. Urban, SoftwareX 13, 100657 (2021)

  34. [34]

    T. Fernandes, S. Vieira, A. Onofre, J. Calderón Bustillo, A. Torres-Forné, and J. A. Font, Classical and Quantum Gravity 40, 195018 (2023), arXiv:2303.13917

  35. [35]

    D. George, H. Shen, and E. A. Huerta, Physical Review D 97, 101501 (2018), arXiv:1706.07446

  36. [36]

    L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen, Classification and Regression Trees (Chapman & Hall/CRC, Philadelphia, 1984)

  37. [37]

    L. Breiman, Machine Learning 45, 5–32 (2001)

  38. [38]

    J. H. Friedman, The Annals of Statistics 29 (2001), 10.1214/aos/1013203451

  39. [39]

    T. Chen and C. Guestrin, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16 (Association for Computing Machinery, New York, NY, USA, 2016) pp. 785–794

  40. [40]

    F. Pedregosa et al., "Scikit-learn: Machine Learning in Python," Journal of Machine Learning Research 12, 2825 (2011), arXiv:1201.0490

  41. [41]

    K. He, X. Zhang, S. Ren, and J. Sun, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2016) pp. 770–778, arXiv:1512.03385

  42. [42]

    Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, Neural Computation 1, 541–551 (1989)

  43. [43]

    R. Wightman, "PyTorch Image Models," https://github.com/rwightman/pytorch-image-models (2019)

  44. [44]

    A. Paszke et al., "PyTorch: An Imperative Style, High-Performance Deep Learning Library," in Advances in Neural Information Processing Systems 32 (Curran Associates, Inc., 2019) pp. 8024–8035, arXiv:1912.01703

  45. [45]

    T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, "Optuna: A Next-generation Hyperparameter Optimization Framework," in The 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019) pp. 2623–2631, arXiv:1907.10902

  46. [46]

    I. Loshchilov and F. Hutter, "Decoupled Weight Decay Regularization" (2017), arXiv:1711.05101

  47. [47]

    L. N. Smith (2018), arXiv:1803.09820

  48. [48]

    S. Bahaadini, V. Noroozi, N. Rohani, S. Coughlin, M. Zevin, J. Smith, V. Kalogera, and A. Katsaggelos, Information Sciences 444, 172 (2018)

  49. [49]

    P. Laguarta, R. van der Laag, M. Lopez, T. Dooney, A. L. Miller, S. Schmidt, M. Cavaglia, S. Caudill, K. Driessens, J. Karel, R. Lenders, and C. Van Den Broeck, Classical and Quantum Gravity 41, 055004 (2024), arXiv:2310.03453

  50. [50]

    A. Torres-Forné, E. Cuoco, J. A. Font, and A. Marquina, Physical Review D 102, 023011 (2020), arXiv:2002.11668

  51. [51]

    M. Llorens-Monteagudo, A. Torres-Forné, and J. A. Font (2025), arXiv:2511.16750

  52. [52]

    C. R. Harris et al., Nature 585, 357 (2020), arXiv:2006.10256

  53. [53]

    Wes McKinney, in Proceedings of the 9th Python in Science Conference (2010) pp. 56–61

  54. [54]

    J. D. Hunter, Computing in Science & Engineering 9, 90 (2007)

  55. [55]

    Plotly Technologies Inc., "Collaborative data science" (2015)

  56. [56]

    TorchVision maintainers and contributors, "TorchVision: PyTorch's Computer Vision library," https://github.com/pytorch/vision (2016)

  57. [57]

    A. Clark, "Pillow (PIL Fork) Documentation" (2015)