VIGILant: an automatic classification pipeline for glitches in the Virgo detector
Pith reviewed 2026-05-10 12:38 UTC · model grok-4.3
The pith
VIGILant automatically classifies glitches in Virgo gravitational-wave data using a ResNet34 model that reaches 0.9833 accuracy on test data and has been running in daily operation since O4c.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The ResNet34 model trained on glitch spectrograms outperforms tree-based classifiers that use hand-crafted Omicron parameters, reaching an F1 score of 0.9772 and an accuracy of 0.9833 on a held-out test set drawn from Virgo O3b data. The resulting VIGILant pipeline has run in daily production at the Virgo observatory since the start of observing run O4c, supplying real-time class labels and a dashboard that surfaces uncertain cases.
What carries the argument
The VIGILant pipeline, which routes either structured Omicron parameters through tree ensembles or spectrogram images through a ResNet34 convolutional network to produce glitch class labels and confidence scores.
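The two-branch routing can be sketched as follows. The class names, the 0.8 review threshold, and the stub models are illustrative assumptions, not details from the paper:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

@dataclass
class Glitch:
    """Minimal stand-in for a VIGILant input; either branch's data may be present."""
    omicron_features: Optional[Sequence[float]] = None  # e.g. (peakFreq, SNR, amplitude)
    spectrogram: Optional[object] = None                # image tensor in the real pipeline

@dataclass
class Prediction:
    label: str
    confidence: float
    needs_review: bool  # low-confidence cases surfaced on the dashboard

def classify(glitch: Glitch,
             tree_model: Callable[[Sequence[float]], Tuple[str, float]],
             cnn_model: Callable[[object], Tuple[str, float]],
             review_threshold: float = 0.8) -> Prediction:
    """Route a glitch through one of the two branches described above."""
    if glitch.spectrogram is not None:
        label, conf = cnn_model(glitch.spectrogram)        # ResNet34 branch
    else:
        label, conf = tree_model(glitch.omicron_features)  # tree-ensemble branch
    return Prediction(label, conf, needs_review=conf < review_threshold)

# Stubs standing in for the trained classifiers:
tree_stub = lambda feats: ("scattered_light", 0.91)
cnn_stub = lambda img: ("blip", 0.62)

p = classify(Glitch(spectrogram="<image>"), tree_stub, cnn_stub)
assert p.label == "blip" and p.needs_review  # low confidence gets flagged
```

Injecting the models as callables keeps the routing logic testable independently of any trained weights.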
If this is right
- Glitch populations can be monitored in near real time, allowing operators to correlate specific classes with detector subsystems.
- Low-confidence glitches are automatically flagged for expert inspection rather than being silently misclassified.
- The model can be retrained periodically on newer labeled data to track evolving glitch morphologies.
- Downstream gravitational-wave searches receive cleaner data segments once problematic glitches are identified and excised.
Where Pith is reading between the lines
- The approach could be ported to LIGO or KAGRA by swapping the input data streams and retraining on their respective glitch catalogs.
- If class definitions are kept stable, the pipeline provides a quantitative baseline for measuring improvements in detector commissioning over successive runs.
- Integration with existing Omicron trigger generators would allow end-to-end automated flagging from raw strain data to classified output.
Load-bearing premise
The glitch types and statistical properties present in the O3b training data will continue to match those encountered in O4c without large distribution shifts or changes in how glitches are defined.
What would settle it
A sustained drop in classification accuracy below 0.95 or a sharp rise in low-confidence predictions when VIGILant processes a fresh month of O4c data would indicate that the model no longer generalizes.
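That settling criterion is mechanical enough to sketch as a monitoring check. The 0.95 floor comes from the text above; the baseline low-confidence rate, spike factor, and three-day window are assumed monitoring parameters, not values from the paper:

```python
def generalization_alarm(daily_accuracies, low_conf_fraction,
                         acc_floor=0.95, baseline_low_conf=0.05,
                         spike_factor=2.0, sustain_days=3):
    """Flag the two failure signals named above: a sustained accuracy drop
    below the floor, or a sharp rise in low-confidence predictions
    relative to an assumed baseline rate."""
    recent = daily_accuracies[-sustain_days:]
    sustained_drop = len(recent) == sustain_days and all(a < acc_floor for a in recent)
    confidence_spike = low_conf_fraction > spike_factor * baseline_low_conf
    return sustained_drop or confidence_spike

# Healthy month: no alarm.
assert not generalization_alarm([0.97, 0.96, 0.98, 0.97], 0.04)
# Three straight days under the floor: alarm.
assert generalization_alarm([0.97, 0.94, 0.93, 0.94], 0.04)
```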
Original abstract
Glitches frequently contaminate data in gravitational-wave detectors, complicating the observation and analysis of astrophysical signals. This work introduces VIGILant, an automatic pipeline for classification and visualization of glitches in the Virgo detector. Using a curated dataset of Virgo O3b glitches, two machine learning approaches are evaluated: tree-based models (Decision Tree, Random Forest and XGBoost) using structured Omicron parameters, and Convolutional Neural Networks (ResNet) trained on spectrogram images. While tree-based models offer higher interpretability and fast training, the ResNet34 model achieved superior performance, reaching a F1 score of 0.9772 and accuracy of 0.9833 in the testing set, with inference times of tens of milliseconds per glitch. The pipeline has been deployed for daily operation at the Virgo site since observing run O4c, providing the Virgo collaboration with an interactive dashboard to monitor glitch populations and detector behavior. This allows to identify low-confidence predictions, highlighting glitches requiring further attention.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces VIGILant, an automatic pipeline for classifying and visualizing glitches in the Virgo gravitational-wave detector. It evaluates tree-based models (Decision Tree, Random Forest, XGBoost) on structured Omicron parameters against a ResNet CNN trained on spectrogram images, using a curated dataset of O3b glitches. The ResNet34 model is reported to achieve the best performance with F1 score 0.9772 and accuracy 0.9833 on the test set and inference times of tens of milliseconds; the pipeline has been deployed for daily operation at the Virgo site since O4c, including an interactive dashboard for monitoring glitch populations and low-confidence predictions.
Significance. If the reported performance generalizes, this provides a practical, deployable tool for real-time glitch monitoring that can improve data quality assessment and detector characterization during observing runs. The combination of high-accuracy CNN classification with an operational dashboard and fast inference is a concrete strength for the Virgo collaboration. Explicit credit is due for the reported deployment in O4c operations, which demonstrates end-to-end utility beyond offline evaluation.
major comments (2)
- [Abstract] Abstract: The headline F1 score of 0.9772 and accuracy of 0.9833 are presented without any information on dataset size, class balance, train-test split method, cross-validation procedure, or uncertainty estimates on the metrics. This omission is load-bearing for the central claim of superior and reliable classification performance, as it prevents assessment of overfitting or statistical robustness.
- [Abstract] Deployment statement (Abstract and concluding sections): The manuscript asserts daily operational use since O4c without supplying any O4c-specific accuracy, confusion matrix, or distribution-shift diagnostics. Because glitch morphologies and class frequencies are known to evolve between runs due to detector changes, the absence of quantitative transfer validation leaves the real-time reliability claim unsupported by evidence.
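For concreteness, one standard way to produce the uncertainty estimates the first comment asks for is a percentile bootstrap over the test set. This illustrates the technique, not the authors' procedure; the same resampling applies to F1 with a different statistic:

```python
import random

def bootstrap_accuracy_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval on accuracy: resample the
    test set with replacement, recompute accuracy each time, and read off
    the alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(sum(y_true[i] == y_pred[i] for i in idx) / n)
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]

# Toy labels with 95% agreement on 200 samples (illustrative data only):
y_true = [i % 5 for i in range(200)]
y_pred = [t if i % 20 else (t + 1) % 5 for i, t in enumerate(y_true)]
lo, hi = bootstrap_accuracy_ci(y_true, y_pred)
assert lo <= 0.95 <= hi  # the interval brackets the point estimate
```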
minor comments (1)
- [Abstract] Abstract: The phrasing 'This allows to identify low-confidence predictions' is grammatically awkward and should be revised for clarity (e.g., 'This allows identification of...' or 'This enables the identification of...').
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and constructive comments on our manuscript. We address each major comment point by point below, proposing revisions to enhance clarity and support for the claims where appropriate.
Point-by-point responses
Referee: [Abstract] Abstract: The headline F1 score of 0.9772 and accuracy of 0.9833 are presented without any information on dataset size, class balance, train-test split method, cross-validation procedure, or uncertainty estimates on the metrics. This omission is load-bearing for the central claim of superior and reliable classification performance, as it prevents assessment of overfitting or statistical robustness.
Authors: We agree that the abstract would be strengthened by including these details to allow readers to better evaluate the performance claims. In the revised manuscript, we will update the abstract to report the size of the curated O3b glitch dataset, the class balance across glitch types, the train-test split method (including any stratification), the cross-validation procedure if used, and uncertainty estimates on the metrics where computable from the evaluation. This will make the headline results more interpretable without requiring immediate reference to the main text. revision: yes
Referee: [Abstract] Deployment statement (Abstract and concluding sections): The manuscript asserts daily operational use since O4c without supplying any O4c-specific accuracy, confusion matrix, or distribution-shift diagnostics. Because glitch morphologies and class frequencies are known to evolve between runs due to detector changes, the absence of quantitative transfer validation leaves the real-time reliability claim unsupported by evidence.
Authors: The deployment statement describes the factual integration of the pipeline into Virgo's daily operations since O4c, including the interactive dashboard for monitoring glitch populations and low-confidence predictions. The reported F1 score and accuracy are explicitly based on the O3b test set, as detailed in the manuscript body. We acknowledge that no O4c-specific quantitative metrics or distribution-shift analysis are provided, which limits claims of generalization across runs. We will revise the abstract and concluding sections to explicitly distinguish the O3b validation results from the operational deployment, noting that the latter enables ongoing qualitative monitoring by the collaboration and can support future retraining. This addresses the concern by clarifying the scope of the claims. revision: partial
Circularity Check
No circularity: standard empirical ML evaluation on held-out test set
Full rationale
The paper trains and evaluates tree-based models and ResNet34 on a curated O3b glitch dataset, reporting F1 and accuracy on a held-out testing set. These metrics are computed directly from model predictions against ground-truth labels in the test split; no reported quantity is defined in terms of the fitted parameters themselves. No self-citations, uniqueness theorems, or ansatzes appear in any load-bearing step. The O4c deployment statement is an operational claim without quantitative performance assertions that would require circular validation. The analysis is therefore self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- ResNet34 and tree-model hyperparameters
axioms (2)
- domain assumption: The curated O3b glitch dataset is representative of glitches in O4c
- domain assumption: Omicron parameters and spectrogram images contain sufficient information to distinguish glitch classes
Reference graph
Works this paper leans on
- [1] "Percentages (shown for the training set) indicate the fraction of each subset represented by the given class."
  Omicron parameters — For each glitch, identified by its central GPS time, a few parameters that are useful for its characterization are computed by the Omicron algorithm: peakFreq, the peak frequency, that is, the frequency of the tile at which the SNR is maximized; SNR, the signal-to-noise ratio of the loudest tile forming this glitch; amplitude, the am...
- [2] "encoded" views introduced in [26]. This encoded view, using the 0.5, 2.0 and 4.0 second windows, corresponds to the "encoded134
  Spectrograms — Aside from their Omicron parameters, glitches can be represented as spectrograms, using the Q-transform method [23]. In this representation, a high-resolution spectrogram shows how the glitch frequency (y-axis) and amplitude (colour scale) evolve over time (x-axis). An example of each glitch class using this representation is shown in Figure 4. ...
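The Q-transform excerpt above can be illustrated with a toy single-tile computation: each tile correlates the series with a Gaussian-windowed sinusoid whose bandwidth scales as f0/Q. This is a minimal sketch of the idea, not the Omicron or GWpy implementation:

```python
import math, cmath

def q_tile_energy(x, fs, f0, q):
    """Energy of one Q-transform tile: correlate the series with a
    Gaussian-windowed complex sinusoid at centre frequency f0, with
    window width set so that all tiles share the same quality factor Q."""
    n = len(x)
    sigma = q / (2 * math.pi * f0)          # constant-Q window width
    num, wsum = 0j, 0.0
    for k, xk in enumerate(x):
        t = k / fs - 0.5 * n / fs           # time relative to window centre
        w = math.exp(-0.5 * (t / sigma) ** 2)
        num += xk * w * cmath.exp(-2j * math.pi * f0 * t)
        wsum += w
    return abs(num / wsum) ** 2

# A 100 Hz Gaussian burst lights up the 100 Hz tile, not the 300 Hz one.
fs, f_burst = 4096, 100.0
x = [math.sin(2 * math.pi * f_burst * (k / fs - 0.5))
     * math.exp(-0.5 * ((k / fs - 0.5) / 0.05) ** 2) for k in range(fs)]
e_on = q_tile_energy(x, fs, 100.0, q=12.0)
e_off = q_tile_energy(x, fs, 300.0, q=12.0)
assert e_on > 100 * e_off
```

A full spectrogram repeats this over a grid of times and frequencies; production code uses FFT-based implementations rather than this direct sum.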
- [3] Tree-based models — Three tree-based ML algorithms were employed to classify glitches using the Omicron features: Decision Trees (DTs), Random Forests (RFs) and Extreme Gradient Boosting (XGBoost). These models were selected for their established performance on structured data and their interpretability. The simplest algorithm, the Decision Tree [27], is tr...
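A minimal sketch of the tree-based branch, assuming scikit-learn and a toy stand-in for the (peakFreq, SNR, amplitude) feature table; the real training data is the curated O3b catalogue, and the same recipe applies to DecisionTreeClassifier or XGBoost:

```python
import random
from sklearn.ensemble import RandomForestClassifier

# Toy feature table: two fake glitch classes separated in peak frequency.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    cls = rng.randrange(2)
    peak_freq = rng.gauss(60.0 if cls == 0 else 500.0, 20.0)
    snr = 8.0 + rng.expovariate(0.2)
    amplitude = 1e-22 * snr            # amplitude loosely tied to SNR
    X.append([peak_freq, snr, amplitude])
    y.append(cls)

# Train on 200 rows, evaluate on the held-out 100.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:200], y[:200])
accuracy = clf.score(X[200:], y[200:])
assert accuracy > 0.9  # classes are cleanly separable by peakFreq
```

Interpretability here comes cheap: `clf.feature_importances_` directly ranks the Omicron parameters, which is the advantage the excerpt credits to tree models.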
- [4] Convolutional Neural Networks — For the image classification we employ models from the ResNet architecture family [32]. ResNets are convolutional neural networks (CNNs) [33] which extract hierarchical features from images: early convolutional layers detect low-level patterns such as edges, while deeper layers capture higher-level, more abstract featur...
- [5] fetch the Omicron unclustered triggers from the previous day
- [6] cluster the triggers
- [7] generate spectrograms
- [8] use the trained ResNet to get the model predictions
- [9] update the glitch dashboard. Initially, VIGILant retrieves all Omicron unclustered triggers from the previous day, with a 10-second margin to avoid splitting glitches that occur around the change of the calendar day. Furthermore, only triggers produced while the Virgo interferometer was in observing mode, i.e. officially collecting data for astrophysical anal...
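Step 2 of the daily loop (trigger clustering) is simple enough to sketch; the 0.1 s clustering gap is an assumed value, as the paper's window is not given in these excerpts:

```python
def cluster_triggers(gps_times, gap=0.1):
    """Group unclustered Omicron triggers whose GPS times fall within
    `gap` seconds of the previous trigger, so that one physical glitch
    producing many triggers is counted once."""
    clusters = []
    for t in sorted(gps_times):
        if clusters and t - clusters[-1][-1] <= gap:
            clusters[-1].append(t)   # extend the current glitch
        else:
            clusters.append([t])     # start a new glitch
    return clusters

# Six triggers collapse into three glitches:
glitches = cluster_triggers([1400000000.00, 1400000000.05, 1400000000.09,
                             1400000001.00, 1400000001.02, 1400000005.00])
assert len(glitches) == 3
```

The same sorted single pass also makes the 10-second midnight margin cheap: triggers from the margin simply join clusters that straddle the day boundary.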
- [10] B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Physical Review Letters 116, 061102 (2016), arXiv:1602.03837
- [11] J. Aasi et al. (LIGO Scientific Collaboration), Classical and Quantum Gravity 32, 074001 (2015), arXiv:1411.4547
- [12] F. Acernese et al. (Virgo Collaboration), Classical and Quantum Gravity 32, 024001 (2015), arXiv:1408.3978
- [13] T. Akutsu et al., Progress of Theoretical and Experimental Physics 2021, 05A101 (2021), arXiv:2005.05574
- [14]
- [15] R. Abbott et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration), GWTC-4.0: Tests of General Relativity. I. Overview and General Tests (2026), arXiv:2603.19019
- [16] B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Physical Review Letters 121, 161101 (2018)
- [17] A. G. Abac et al. (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration) (2025), arXiv:2508.18083
- [18]
- [19]
- [20] D. Davis et al., Classical and Quantum Gravity 38, 135014 (2021), arXiv:2101.11673
- [21] F. Di Renzo, "Data quality overview after re-engaging Science mode," https://logbook.virgo-gw.eu/virgo/?r=64853 (2024)
- [22]
- [23]
- [24] F. Robinet, N. Arnaud, N. Leroy, A. Lundgren, D. Macleod, and J. McIver, SoftwareX 12, 100620 (2020), arXiv:2007.11374
- [25] M. Zevin et al., Classical and Quantum Gravity 34, 064003 (2017), arXiv:1611.04596
- [26] M. Zevin et al., The European Physical Journal Plus 139, 100 (2024), arXiv:2308.15530
- [27]
- [28] M. Razzano, F. Di Renzo, F. Fidecaro, G. Hemming, and S. Katsanevas, Nuclear Instruments and Methods in Physics Research Section A 1048, 167959 (2023), arXiv:2301.05112
- [29] L. McInnes, J. Healy, and J. Melville, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction (2018), arXiv:1802.03426
- [30] I. T. Jolliffe, Principal Component Analysis, Springer Series in Statistics (Springer-Verlag, 2002)
- [31] F. Robinet, Omicron: an algorithm to detect and characterize transient events in gravitational-wave detectors, Tech. Rep. (Virgo, 2018)
- [32] S. Chatterji, L. Blackburn, G. Martin, and E. Katsavounidis, Classical and Quantum Gravity 21, S1809 (2004), arXiv:gr-qc/0412119
- [33] D. M. Macleod, J. S. Areeda, S. B. Coughlin, T. J. Massinger, and A. L. Urban, SoftwareX 13, 100657 (2021)
- [34] T. Fernandes, S. Vieira, A. Onofre, J. Calderón Bustillo, A. Torres-Forné, and J. A. Font, Classical and Quantum Gravity 40, 195018 (2023), arXiv:2303.13917
- [35]
- [36] L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen, Classification and Regression Trees (Chapman & Hall/CRC, Philadelphia, 1984)
- [37] L. Breiman, Machine Learning 45, 5–32 (2001)
- [38] J. H. Friedman, The Annals of Statistics 29 (2001), 10.1214/aos/1013203451
- [39] T. Chen and C. Guestrin, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16 (Association for Computing Machinery, New York, NY, USA, 2016) pp. 785–794
- [40] F. Pedregosa et al., Journal of Machine Learning Research 12, 2825 (2011), arXiv:1201.0490
- [41] K. He, X. Zhang, S. Ren, and J. Sun, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2016) pp. 770–778, arXiv:1512.03385
- [42] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, Neural Computation 1, 541–551 (1989)
- [43] R. Wightman, "PyTorch Image Models," https://github.com/rwightman/pytorch-image-models (2019)
- [44] A. Paszke et al., in Advances in Neural Information Processing Systems 32 (Curran Associates, Inc., 2019) pp. 8024–8035, arXiv:1912.01703
- [45] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, in The 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019) pp. 2623–2631, arXiv:1907.10902
- [46] I. Loshchilov and F. Hutter, Decoupled Weight Decay Regularization (2017), arXiv:1711.05101
- [47] L. N. Smith (2018), arXiv:1803.09820
- [48] S. Bahaadini, V. Noroozi, N. Rohani, S. Coughlin, M. Zevin, J. Smith, V. Kalogera, and A. Katsaggelos, Information Sciences 444, 172 (2018)
- [49] P. Laguarta, R. van der Laag, M. Lopez, T. Dooney, A. L. Miller, S. Schmidt, M. Cavaglia, S. Caudill, K. Driessens, J. Karel, R. Lenders, and C. Van Den Broeck, Classical and Quantum Gravity 41, 055004 (2024), arXiv:2310.03453
- [50] A. Torres-Forné, E. Cuoco, J. A. Font, and A. Marquina, Physical Review D 102, 023011 (2020), arXiv:2002.11668
- [51] M. Llorens-Monteagudo, A. Torres-Forné, and J. A. Font (2025), arXiv:2511.16750
- [52]
- [53] W. McKinney, in Proceedings of the 9th Python in Science Conference (2010) pp. 56–61
- [54] J. D. Hunter, Computing in Science & Engineering 9, 90 (2007)
- [55] Plotly Technologies Inc., "Collaborative data science" (2015)
- [56] TorchVision maintainers and contributors, "TorchVision: PyTorch's Computer Vision library," https://github.com/pytorch/vision (2016)
- [57] A. Clark, "Pillow (PIL Fork) Documentation" (2015)