
arxiv: 2607.00720 · v1 · pith:ROEMTWJPnew · submitted 2026-07-01 · 💻 cs.LG · cs.AI

Detecting the Undetectable: Enhancing Unsupervised time series Anomaly Detection via Active Learning

Pith reviewed 2026-07-02 16:16 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords active learning · unsupervised anomaly detection · time series · masked reconstruction · minimax learning · multivariate time series · reconstruction-based detection

The pith

Active learning with masked reconstruction feedback and minimax strategy raises unsupervised time series anomaly detection AUC by 12.39 percent across 28 test cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to improve unsupervised anomaly detection in multivariate time series, where pure reconstruction models often fail on subtle or noisy anomalies. It adds an active learning loop that queries a small number of labels and feeds them back through two new mechanisms: masked time-series reconstruction that compels the model to capture temporal structure, and a minimax objective that treats normal and abnormal samples differently. The result is reported as an average 12.39 percent AUC gain when the framework is wrapped around seven different unsupervised backbones on four datasets. A sympathetic reader would care because labeling remains expensive yet the method claims to need only limited new labels to lift already-deployed unsupervised systems.

Core claim

The authors state that an active learning framework built on masked time-series reconstruction feedback and a minimax learning strategy can be added to existing unsupervised reconstruction-based detectors; the added loop iteratively selects informative samples for labeling, retrains the model to learn robust temporal dependencies, and differentially penalizes normal versus abnormal reconstructions, yielding a 12.39 percent average AUC improvement over the original unsupervised models in 28 test cases spanning four multivariate time-series datasets and seven backbone architectures.

What carries the argument

Masked time-series reconstruction feedback strategy paired with a minimax learning objective inside an active learning selection loop; the first forces the model to reconstruct masked segments and thereby learn temporal dependencies, while the second differentially weights reconstruction errors on normal and abnormal samples.
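A minimal sketch of how these two pieces might look in code. The paper's exact loss is not given in the material reviewed here, so everything below is an assumption made for illustration: the masking scheme (one contiguous segment, filled with a constant), the form of `minimax_loss` (mean error on samples labeled normal, minus a hypothetical weight `lam` times the mean error on samples labeled abnormal), and all function names.

```python
def mask_segment(window, start, length, fill=0.0):
    """Return a copy of `window` with [start, start+length) set to `fill`,
    together with the list of masked indices (clipped to the window)."""
    masked = list(window)
    idx = list(range(start, min(start + length, len(window))))
    for i in idx:
        masked[i] = fill
    return masked, idx


def masked_recon_error(window, reconstruct, start, length):
    """Mean squared error measured only on the masked positions.

    `reconstruct` is any callable mapping a window to a same-length window;
    it stands in for the unsupervised backbone (hypothetical interface).
    """
    masked, idx = mask_segment(window, start, length)
    if not idx:
        raise ValueError("mask does not overlap the window")
    recon = reconstruct(masked)
    return sum((recon[i] - window[i]) ** 2 for i in idx) / len(idx)


def minimax_loss(errors, labels, lam=1.0):
    """Assumed differential objective: minimizing this value pushes the
    reconstruction error down on normal samples (label 0) and up on
    abnormal samples (label 1)."""
    normal = [e for e, y in zip(errors, labels) if y == 0]
    abnormal = [e for e, y in zip(errors, labels) if y == 1]
    loss = 0.0
    if normal:
        loss += sum(normal) / len(normal)
    if abnormal:
        loss -= lam * sum(abnormal) / len(abnormal)
    return loss
```

Scoring only the masked positions is what makes the task depend on temporal context: the model cannot copy its input there and has to infer the hidden values from their neighbors.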

If this is right

  • Existing unsupervised reconstruction models can be upgraded by wrapping them in the proposed active learning loop without architectural redesign.
  • The method reduces the impact of noise inside normal samples by the differential treatment in the minimax stage.
  • Detection of near-normal anomalies improves because the masked reconstruction forces explicit modeling of temporal context.
  • The reported gains hold across seven different backbone models, indicating the framework is largely backbone-agnostic.
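The "wrap an existing model" idea in the first and last bullets can be sketched as a loop that only needs two operations from the backbone. This is an illustration under stated assumptions, not the authors' implementation: the `score`/`update` interface, the rule of querying the highest-scoring samples, and the toy `MeanBackbone` are all hypothetical.

```python
class MeanBackbone:
    """Toy stand-in for a reconstruction model: it 'reconstructs' every
    point as a stored center, so the score is the squared deviation."""

    def __init__(self, data):
        self.center = sum(data) / len(data)

    def score(self, x):
        return (x - self.center) ** 2

    def update(self, labeled):
        # Refit on points the oracle confirmed as normal (label 0).
        normals = [x for x, y in labeled if y == 0]
        if normals:
            self.center = sum(normals) / len(normals)


def active_loop(model, pool, oracle, rounds, per_round):
    """Query labels for the highest-scoring unlabeled points, then retrain.

    `model` may be any object exposing score(x) and update(labeled);
    `oracle(i)` returns the true label of pool[i] (the human annotator).
    """
    labeled = []
    unlabeled = list(range(len(pool)))
    for _ in range(rounds):
        unlabeled.sort(key=lambda i: model.score(pool[i]), reverse=True)
        query, unlabeled = unlabeled[:per_round], unlabeled[per_round:]
        labeled += [(pool[i], oracle(i)) for i in query]
        model.update(labeled)
    return labeled
```

Because the loop touches the model only through `score` and `update`, swapping in a different backbone would not change the loop itself, which is the sense in which such a wrapper can be called backbone-agnostic.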

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the cost of obtaining even a few reliable labels is higher than assumed, the net benefit of the framework would shrink relative to staying fully unsupervised.
  • The approach could be tested on streaming settings where new anomalies arrive continuously and the active query budget must be allocated online.
  • Similar masked-plus-minimax feedback might transfer to other sequential anomaly tasks such as network traffic or physiological signals.

Load-bearing premise

The additional labels obtained through active learning queries can be acquired at negligible cost and are reliable enough to drive the observed performance gains without introducing new selection bias.

What would settle it

Re-running the 28 test cases with the same number of randomly chosen labels instead of actively selected ones, or with deliberately noisy labels, and checking whether the 12.39 percent AUC lift disappears.
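The control described above needs three small ingredients: an active selector, a random selector with the same budget, and a way to corrupt labels. A hedged sketch follows; the function names, the "highest score first" active rule, and the symmetric label-flip noise model are assumptions chosen for illustration.

```python
import random


def select_active(scores, budget):
    """Indices of the `budget` highest-scoring samples (assumed active rule)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]


def select_random(n, budget, seed):
    """Uniformly random indices with the same budget; seeded so the
    control run can be repeated exactly."""
    return sorted(random.Random(seed).sample(range(n), budget))


def flip_labels(labels, rate, seed):
    """Noisy-label control: flip each binary label with probability `rate`."""
    rng = random.Random(seed)
    return [1 - y if rng.random() < rate else y for y in labels]
```

Holding `budget` fixed across `select_active` and `select_random` is the point of the comparison: any remaining gap in AUC can then be credited to which samples were labeled, not to how many.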

Figures

Figures reproduced from arXiv: 2607.00720 by Hyeongwon Kang, Jinwoo Park, Pilsung Kang, Seung Hun Han.

Figure 1: Raw time series of UCR 001 and anomaly score of each model where pink …
Figure 2: Overall architecture of proposed framework
Figure 3: Example of dataset configuration process
Figure 4: Illustration of two different ways of query sampling strategies
Figure 5: Illustration of Masked time series Reconstruction based Feedback Strategy
Figure 6: Bar plot showing AUC of experimental result with seven different backbone …
Figure 7: PA%K curve for four datasets and seven backbone models with our proposed …
Figure 8: AUC score of Transformer backbone model for four datasets when major com…
Figure 9: Comparison of anomaly score of Transformer backbone model when active learn…
Original abstract

Despite the increasing sophistication of industrial AI systems, the ability to reliably detect subtle and noisy anomalies in complex time series data remains a critical yet unresolved challenge. In large-scale industrial applications, labeling time series data is often prohibitively expensive and time-consuming, making unsupervised learning a practical and widely adopted approach. However, existing unsupervised methods frequently struggle to distinguish near-normal anomalies from normal patterns and are vulnerable to noise contamination within normal samples. To address these limitations, we propose a novel framework that leverages active learning to iteratively enhance the performance of unsupervised models. Our framework's core contributions are (1) a masked time-series reconstruction feedback strategy that forces the model to learn robust temporal dependencies, and (2) a minimax learning strategy that promotes robustness by differentially treating normal and abnormal samples. This process encourages the model to better capture the dynamics of subtle and noisy patterns. The proposed framework is evaluated across 28 test cases involving four multivariate time-series datasets and seven unsupervised backbone models. Experimental results demonstrate a 12.39% improvement in AUC compared to the original models, confirming that our method can be readily integrated into existing unsupervised reconstruction-based anomaly detection systems to significantly enhance their performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper claims that an active learning framework, incorporating a masked time-series reconstruction feedback strategy and a minimax learning strategy, can enhance existing unsupervised reconstruction-based anomaly detection models. It reports an aggregate 12.39% AUC improvement over the original unsupervised models across 28 test cases on four multivariate time-series datasets and seven backbone models.

Significance. If the reported gains prove robust and attributable to the specific proposed components rather than generic effects of labeling, the work would offer a practical, integrable enhancement for industrial time-series anomaly detection where full supervision is costly. The emphasis on compatibility with existing unsupervised backbones is a potential strength.

major comments (2)
  1. [Abstract and Experimental Results] Abstract and reported results: the central claim of a 12.39% aggregate AUC improvement supplies no per-dataset breakdowns, error bars, description of active learning budget or query strategy selection, or confirmation that the same data splits were used for unsupervised pre-training and evaluation. These omissions make it impossible to assess whether the improvement is reliable or reproducible.
  2. [Experimental Results] Experimental evaluation: the framework acquires human labels via active learning while the unsupervised baselines receive none. No ablation is shown that supplies an identical number of labels through random selection or a non-active strategy. Without this control, the specific contributions of the masked reconstruction feedback and minimax strategy cannot be isolated from the generic benefit of adding supervision, which is load-bearing for the paper's core claim.
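The per-dataset error bars requested in comment 1 could be produced along the following lines. This is a sketch only: the rank-based `auc` below is the standard pairwise definition of ROC AUC, while the `summarize` helper and the idea of passing it one AUC per random seed are assumptions about how the numbers would be organized.

```python
from statistics import mean, stdev


def auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen anomaly (label 1)
    receives a higher score than a randomly chosen normal sample (label 0).
    Ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(aucs_by_seed):
    """Mean and sample standard deviation (n - 1 denominator) of the AUCs
    obtained from repeated runs with different seeds."""
    return mean(aucs_by_seed), stdev(aucs_by_seed)
```

Reporting one such pair per dataset and backbone, rather than a single pooled figure, would let a reader see whether the aggregate gain is spread evenly or driven by a few cases.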

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We agree that additional details and controls are needed to strengthen the presentation and isolate the contributions of our proposed components. We will revise the manuscript accordingly and address each point below.

Point-by-point responses
  1. Referee: [Abstract and Experimental Results] Abstract and reported results: the central claim of a 12.39% aggregate AUC improvement supplies no per-dataset breakdowns, error bars, description of active learning budget or query strategy selection, or confirmation that the same data splits were used for unsupervised pre-training and evaluation. These omissions make it impossible to assess whether the improvement is reliable or reproducible.

    Authors: We agree that the current presentation of aggregate results limits assessment of reliability. In the revised version, we will expand the abstract and results section to report per-dataset AUC values with error bars (standard deviation across 5 random seeds). We will explicitly state the active learning budget (number of labeled samples queried per iteration and total budget as a percentage of the training set) and the query strategy (uncertainty sampling based on reconstruction error). We will also add a dedicated paragraph confirming that the unsupervised pre-training and final evaluation use identical train/validation/test splits, with full details on how the splits were generated. revision: yes

  2. Referee: [Experimental Results] Experimental evaluation: the framework acquires human labels via active learning while the unsupervised baselines receive none. No ablation is shown that supplies an identical number of labels through random selection or a non-active strategy. Without this control, the specific contributions of the masked reconstruction feedback and minimax strategy cannot be isolated from the generic benefit of adding supervision, which is load-bearing for the paper's core claim.

    Authors: The referee correctly identifies that the current experiments do not isolate the benefit of active selection from the mere addition of labels. We will add a new ablation study in which the same total number of labels is acquired via uniform random selection (instead of our active strategy) and the model is retrained with the masked reconstruction and minimax objectives. Performance will be compared against both the original unsupervised baselines and our active-learning results across the same 28 test cases. These results will be included in the experimental section with statistical significance tests to demonstrate that the gains are attributable to the proposed feedback and minimax components rather than supervision alone. revision: yes
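The rebuttal promises statistical significance tests without naming one. As one possible choice, a paired exact sign test over the test cases is sketched below; the choice of test and the function name `sign_test_p` are assumptions, and a different paired test could equally be used.

```python
from math import comb


def sign_test_p(before, after):
    """Two-sided exact sign test on paired scores.

    `before[i]` and `after[i]` are the AUCs of the same test case under
    two conditions (for example random versus active selection).
    Tied pairs are dropped, as is conventional for the sign test.
    """
    diffs = [a - b for a, b in zip(after, before) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(d > 0 for d in diffs)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

The test uses only the direction of each paired difference, so it makes no assumption about how AUC gains are distributed; the price is lower power than tests that also use the magnitudes.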

Circularity Check

0 steps flagged

No circularity; empirical performance claims are externally benchmarked

full rationale

The paper advances an empirical active-learning framework evaluated on four datasets and seven backbone models, reporting aggregate AUC gains against the original unsupervised baselines. No equations, parameter-fitting steps, or derivation chains appear in the text. The method components (masked reconstruction feedback, minimax strategy) are presented as design choices whose value is assessed by direct comparison to external baselines rather than by any self-referential definition or fitted-input prediction. Self-citations are not invoked as load-bearing uniqueness theorems. The evaluation protocol therefore remains self-contained against independent benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the framework implicitly assumes that active learning labels can be obtained cheaply enough to justify the loop and that the minimax objective is well-defined for the chosen backbones.


