
arxiv: 2606.28923 · v1 · pith:S436TOEUnew · submitted 2026-06-27 · 💻 cs.LG

Towards Improved Anomaly Detection for Cloud Cybersecurity via Graph Neural Networks

Pith reviewed 2026-06-30 09:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords anomaly detection, graph neural networks, cloud security, self-supervised learning, event logs, alert reduction

The pith

A graph neural network on cloud logs reduces security alerts from thousands to roughly one per hour, in almost all cases, across five organizations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports results from deploying a self-supervised graph neural network that assigns an anomaly score to each event in cloud audit logs. The model updates its notion of normal behavior continuously and flags far fewer events than a rule-based baseline written by security experts. Experiments cover real production logs from five separate organizations. The authors emphasize that the study measures only the volume of surfaced events and cannot quantify how many actual threats may have been missed.

Core claim

Self-supervised graph neural networks applied to sequences of cloud computing events generate per-event anomaly scores that allow analysts to review orders of magnitude fewer events than static rule sets while still responding to changes in normal usage patterns.

What carries the argument

Self-supervised graph neural network that builds graphs from cloud event sequences and outputs an anomaly score for each event.
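
A minimal sketch of how event sequences might be turned into a temporal graph. The summary does not give the paper's node and edge definitions, so this assumes identity and service nodes with one timestamped edge per event. The record fields `userIdentity.arn`, `eventSource`, `eventName`, and `eventTime` are standard CloudTrail fields; `TemporalEdge`, `events_to_temporal_edges`, and the sample records are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class TemporalEdge(NamedTuple):
    src: int      # node id of the calling identity
    dst: int      # node id of the AWS service that was called
    t: float      # event time as a Unix timestamp
    action: str   # API call name, kept as an edge feature

def events_to_temporal_edges(events):
    """Map CloudTrail-style records to a time-ordered edge list.

    Assumption: identity -> service edges. This is an illustrative
    choice, not the construction used in the paper.
    """
    node_ids = {}
    edges = []
    for ev in events:
        # setdefault assigns the next free integer id to unseen nodes.
        src = node_ids.setdefault(("identity", ev["userIdentity"]["arn"]), len(node_ids))
        dst = node_ids.setdefault(("service", ev["eventSource"]), len(node_ids))
        ts = datetime.strptime(ev["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
        ts = ts.replace(tzinfo=timezone.utc).timestamp()
        edges.append(TemporalEdge(src, dst, ts, ev["eventName"]))
    edges.sort(key=lambda e: e.t)  # temporal models consume edges in time order
    return node_ids, edges

# Hypothetical sample records, not real log data.
SAMPLE_EVENTS = [
    {"userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
     "eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "eventTime": "2024-01-01T00:00:10Z"},
    {"userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
     "eventSource": "iam.amazonaws.com", "eventName": "CreateUser",
     "eventTime": "2024-01-01T00:00:05Z"},
    {"userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"},
     "eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "eventTime": "2024-01-01T00:00:20Z"},
]

node_ids, edges = events_to_temporal_edges(SAMPLE_EVENTS)
```

A time-ordered edge list of this shape is the usual input to temporal graph models; which CloudTrail fields become nodes, edges, or features in the actual system is not stated in this summary.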

If this is right

  • The model adapts to new organizational behavior without requiring scheduled retraining.
  • Daily alert volume falls to a level an analyst can review in a practical deployment.
  • Performance claims rest on comparison with a domain-expert rule baseline rather than labeled threat data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph construction approach could be tested on audit logs from other cloud providers.
  • A controlled red-team exercise would be needed to estimate missed threats.
  • The method's value depends on whether the remaining alerts contain a higher fraction of real incidents than the baseline.

Load-bearing premise

Lower numbers of flagged events indicate better detection performance rather than simply overlooking some malicious activity that would have been caught by the rules.
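
A toy illustration of why this premise matters, using invented numbers rather than anything from the paper: two hypothetical detectors that both raise 24 alerts in a day (about one per hour) can have very different recall, so alert volume alone does not separate them.

```python
def precision_recall(flagged, malicious):
    """Precision and recall of a set of flagged event ids."""
    flagged, malicious = set(flagged), set(malicious)
    true_pos = len(flagged & malicious)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / len(malicious) if malicious else 0.0
    return precision, recall

# Hypothetical day of 10,000 events with 4 truly malicious ones.
malicious = {17, 4203, 7777, 9001}

# Noisy rule baseline: thousands of alerts, catches every malicious event.
rules_flags = set(range(0, 2000)) | malicious

# Two low-volume detectors with identical alert counts.
model_a_flags = set(range(100, 120)) | malicious  # 20 benign + 4 malicious
model_b_flags = set(range(100, 124))              # 24 benign, no malicious

rules_pr = precision_recall(rules_flags, malicious)
model_a_pr = precision_recall(model_a_flags, malicious)
model_b_pr = precision_recall(model_b_flags, malicious)
```

Here `model_a_flags` and `model_b_flags` are indistinguishable by volume, yet one keeps full recall and the other misses every threat. Telling them apart requires labels for the malicious events, which the study does not have.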

What would settle it

Insert known attack patterns into otherwise normal logs and measure whether the model assigns high anomaly scores to those specific events.
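
A sketch of how such an injection test could be scored, assuming per-event anomaly scores are already available and that the alert threshold is set by a fixed alert budget. The functions `threshold_for_budget` and `injected_detection_rate`, and the scores themselves, are hypothetical stand-ins for real model output.

```python
def threshold_for_budget(scores, max_alerts):
    """Threshold such that at most max_alerts scores lie strictly above it."""
    ranked = sorted(scores, reverse=True)
    if max_alerts >= len(ranked):
        return float("-inf")
    return ranked[max_alerts]

def injected_detection_rate(scores, injected_idx, max_alerts):
    """Fraction of injected events flagged under a fixed alert budget."""
    thr = threshold_for_budget(scores, max_alerts)
    flagged = {i for i, s in enumerate(scores) if s > thr}
    caught = flagged & set(injected_idx)
    return len(caught) / len(injected_idx), len(flagged)

# Hypothetical anomaly scores for 10 events; events 2 and 8 were injected.
scores = [0.10, 0.20, 0.95, 0.30, 0.15, 0.90, 0.40, 0.05, 0.35, 0.25]
injected_idx = [2, 8]

rate, n_flagged = injected_detection_rate(scores, injected_idx, max_alerts=2)
```

With a budget of two alerts, only one of the two injected events scores above the threshold, so the detection rate in this toy case is 0.5. On real logs the same calculation would give a lower bound on recall for the specific attack patterns that were injected.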

Figures

Figures reproduced from arXiv: 2606.28923 by Edward Raff, Manu Nandan, Michael Brautbar, TJ Jaymes.

Figure 1: Representation of CloudTrail logs as a graph. The nodes on …
Figure 2: Computational blocks in a TGN model from [1]. The sequence …
Figure 3: Method of feature extraction from raw event logs that is …
Figure 5: Graph representation of model A with a focus on relationship …
Figure 6: Graph representation of model B with a focus on relationship …
Original abstract

Detecting security threats in an organization's cloud computing environment has become necessary due to the increased reliance on cloud infrastructure. Logging of all cloud computing events enables investigation into any incidents after they are detected. Automated detection of threats using the logs based on heuristics or anomaly detection could result in a high false positive rate due to its relatively static nature. In this article, we present an industrial case study of a self-supervised learning method using graph neural networks applied to AWS CloudTrail logs to surface suspicious events for analyst review. The model produces an anomaly score for each event and dynamically adapts to changes in the organization without requiring periodic retraining. Based on our experiments across five organizations, the proposed model produced substantially fewer alerts than a domain expert rule-based baseline in almost all cases, reducing alert volumes to approximately 1 per hour from thousands generated by traditional methods. We note that this evaluation covers only flagged events, and false negatives cannot be estimated from the current data; findings should therefore be interpreted as a practical deployment study offering insights into real-world constraints rather than a fully validated detection system. We discuss these limitations and the requirements for extending the approach to other cloud environments as future work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript presents an industrial case study applying a self-supervised graph neural network to AWS CloudTrail logs for surfacing suspicious events. The model produces per-event anomaly scores and adapts dynamically without periodic retraining. Across experiments on logs from five organizations, it generates approximately one alert per hour versus thousands from a domain-expert rule-based baseline. The authors explicitly caveat that the evaluation examines only flagged events, that false negatives cannot be estimated from the data, and that the results should be read as deployment insights rather than evidence from a fully validated detector.

Significance. A reliable method that reduces alert volume by orders of magnitude while remaining adaptive would be practically significant for cloud security operations. The self-supervised formulation and lack of retraining requirement address real deployment constraints. Because the work is positioned as a case study with acknowledged limits on recall assessment, its primary contribution lies in documenting feasible GNN behavior on production logs rather than establishing superior detection performance.

major comments (1)
  1. [Abstract] Abstract: The headline empirical claim is a reduction from thousands of alerts to ~1 per hour. The same paragraph states that 'false negatives cannot be estimated from the current data' and frames the study as a 'practical deployment study' rather than validated detection. This leaves open the possibility that the volume reduction arises from suppressed events (including threats) rather than improved precision; the title's reference to 'Improved Anomaly Detection' therefore rests on an interpretation the reported evidence cannot distinguish.
minor comments (2)
  1. [Model section] The manuscript would benefit from an explicit statement of the graph-construction procedure (node/edge definitions from CloudTrail fields) and the precise self-supervised objective, even if only at high level, to allow readers to assess reproducibility.
  2. [Experiments] Per-organization alert counts or at least ranges should be reported alongside the aggregate 'almost all cases' statement to substantiate the consistency claim.
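
The per-organization reporting asked for in minor comment 2 could be as simple as the sketch below, which turns raw alert records into an alerts-per-hour rate for each organization plus the overall range. The organization labels, observation windows, and counts are hypothetical and not taken from the paper.

```python
from collections import Counter

def alerts_per_hour(alert_orgs, observed_hours):
    """Per-organization alert rate.

    alert_orgs: iterable with one organization label per raised alert.
    observed_hours: dict mapping organization -> hours of logs observed.
    Organizations with no alerts are reported with a rate of 0.0.
    """
    counts = Counter(alert_orgs)
    return {org: counts[org] / hours for org, hours in observed_hours.items()}

# Hypothetical numbers for illustration only.
observed_hours = {"org_a": 48, "org_b": 48, "org_c": 24}
alert_orgs = ["org_a"] * 36 + ["org_b"] * 60

rates = alerts_per_hour(alert_orgs, observed_hours)
rate_range = (min(rates.values()), max(rates.values()))
```

Reporting `rates` alongside `rate_range` would let a reader see whether "approximately 1 per hour" holds for each organization or only on average.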

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review. We address the single major comment below.

  1. Referee: [Abstract] Abstract: The headline empirical claim is a reduction from thousands of alerts to ~1 per hour. The same paragraph states that 'false negatives cannot be estimated from the current data' and frames the study as a 'practical deployment study' rather than validated detection. This leaves open the possibility that the volume reduction arises from suppressed events (including threats) rather than improved precision; the title's reference to 'Improved Anomaly Detection' therefore rests on an interpretation the reported evidence cannot distinguish.

    Authors: The manuscript already includes explicit language in the abstract stating that false negatives cannot be estimated from the data and that results should be read as a practical deployment study rather than evidence from a fully validated detector. The title employs 'Towards Improved' to signal an aspirational research direction rather than a completed claim of superior detection performance. We nevertheless agree that the combination of headline alert-volume numbers with the title could invite the misreading the referee identifies. We will therefore revise the title to 'Self-Supervised Graph Neural Networks for Anomaly Detection in Cloud Logs: A Multi-Organization Deployment Study' and rephrase the abstract's opening empirical sentence to foreground alert-volume reduction under the stated limitations. These changes preserve the reported observations while removing any implication that the data distinguish improved precision from reduced recall. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical alert-volume comparison is independent of model internals


The paper reports an industrial deployment study in which a self-supervised GNN produces anomaly scores on AWS CloudTrail logs and is compared against a rule-based baseline on real organizational data from five sites. The headline result (alert volume reduced to ~1 per hour) is an observed count of flagged events; it is not derived from any equation that re-uses the same fitted parameters or self-citations as its own input. The authors explicitly flag that false negatives cannot be measured and frame the work as a case study rather than a closed-form derivation. No load-bearing step reduces by construction to a fitted input, self-definition, or author-only uniqueness theorem.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no details on specific free parameters, axioms, or invented entities; the approach relies on standard graph neural network assumptions for self-supervised anomaly scoring.

pith-pipeline@v0.9.1-grok · 5737 in / 1073 out tokens · 30236 ms · 2026-06-30T09:52:52.722730+00:00 · methodology


Reference graph

Works this paper leans on

27 extracted references · 15 canonical work pages · 2 internal anchors

  1. [1]

    Temporal Graph Networks for Deep Learning on Dynamic Graphs

    E. Rossi, B. Chamberlain, F. Frasca, D. Eynard, F. Monti, and M. M. Bronstein, “Temporal graph networks for deep learning on dynamic graphs,” CoRR, vol. abs/2006.10637, 2020. [Online]. Available: https://arxiv.org/abs/2006.10637

  2. [2]

    A survey of graph neural networks for social recommender systems

    K. Sharma, Y.-C. Lee, S. Nambi, A. Salian, S. Shah, S.-W. Kim, and S. Kumar, “A survey of graph neural networks for social recommender systems,” ACM Comput. Surv., vol. 56, no. 10, Jun

  3. [3]

    Available: https://doi.org/10.1145/3661821

    [Online]. Available: https://doi.org/10.1145/3661821

  4. [4]

    Graph convolutional neural networks for web-scale recommender systems

    R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, “Graph convolutional neural networks for web-scale recommender systems,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ser. KDD ’18. New York, NY, USA: Association for Computing Machinery, 2018, p. 974–983. [Online]. Available:...

  5. [5]

    Eta prediction with graph neural networks in google maps

    A. Derrow-Pinion, J. She, D. Wong, O. Lange, T. Hester, L. Perez, M. Nunkesser, S. Lee, X. Guo, B. Wiltshire, P. W. Battaglia, V. Gupta, A. Li, Z. Xu, A. Sanchez-Gonzalez, Y. Li, and P. Velickovic, “Eta prediction with graph neural networks in google maps,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, se...

  6. [6]

    Detecting credit card fraud via heterogeneous graph neural networks with graph attention

    Q. Sha, T. Tang, X. Du, J. Liu, Y. Wang, and Y. Sheng, “Detecting credit card fraud via heterogeneous graph neural networks with graph attention,” in 2025 IEEE 6th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT), 2025, pp. 1332–1336

  7. [7]

    Heterogeneous graph neural networks for fraud detection and explanation in supply chain finance

    B. Wu, K.-M. Chao, and Y. Li, “Heterogeneous graph neural networks for fraud detection and explanation in supply chain finance,” Information Systems, vol. 121, p. 102335, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0306437923001710

  8. [8]

    Unsupervised log message anomaly detection

    A. Farzad and T. A. Gulliver, “Unsupervised log message anomaly detection,” ICT Express, vol. 6, no. 3, pp. 229–237,

  9. [9]

    Available: https://www.sciencedirect.com/science/article/pii/S2405959520300643

    [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2405959520300643

  10. [10]

    Deeplog: Anomaly detection and diagnosis from system logs through deep learning

    M. Du, F. Li, G. Zheng, and V. Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’17. New York, NY, USA: Association for Computing Machinery, 2017, p. 1285–1298. [Online]. Available: https://doi.org/10.1145/3133956.3134015

  11. [11]

    Advancing Cloud-Native Cyber Threat Detection with Graph-Based Feature Engineering

    T. Song, M. Organokov, L. Gulikers, G. Grassi, G. Carofiglio, and M. Meo, “Advancing Cloud-Native Cyber Threat Detection with Graph-Based Feature Engineering,” in 2025 IEEE 41st International Conference on Data Engineering (ICDE). Los Alamitos, CA, USA: IEEE Computer Society, May 2025, pp. 4291–4297. [Online]. Available: https://doi.ieeecomputersociety.or...

  12. [12]

    Trail: A Knowledge Graph-Based Approach for Attributing Advanced Persistent Threats

    I. J. King, R. Ramirez, B. Bowman, and H. H. Huang, “Trail: A Knowledge Graph-Based Approach for Attributing Advanced Persistent Threats,” in 2025 IEEE 41st International Conference on Data Engineering (ICDE), May 2025, pp. 1207–1220, ISSN: 2375-026X. [Online]. Available: https://ieeexplore.ieee.org/document/11113100

  13. [13]

    GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis

    M. Saqib, B. C. Fung, P. Charland, and A. Walenstein, “GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis,” in 2024 IEEE 40th International Conference on Data Engineering (ICDE), May 2024, pp. 2258–2270, ISSN: 2375-026X. [Online]. Available: https://ieeexplore.ieee.org/document/10598144

  14. [14]

    A query system for efficiently investigating complex attack behaviors for enterprise security

    P. Gao, X. Xiao, Z. Li, K. Jee, F. Xu, S. R. Kulkarni, and P. Mittal, “A query system for efficiently investigating complex attack behaviors for enterprise security,” Proc. VLDB Endow., vol. 12, no. 12, pp. 1802–1805, Aug. 2019. [Online]. Available: https://doi.org/10.14778/3352063.3352070

  15. [15]

    A System for Automated Open-Source Threat Intelligence Gathering and Management

    P. Gao, X. Liu, E. Choi, B. Soman, C. Mishra, K. Farris, and D. Song, “A System for Automated Open-Source Threat Intelligence Gathering and Management,” in Proceedings of the 2021 International Conference on Management of Data, ser. SIGMOD ’21. New York, NY, USA: Association for Computing Machinery, Jun. 2021, pp. 2716–2720. [Online]. Available: https://d...

  16. [16]

    A systematic review on anomaly detection for cloud computing environments

    T. Hagemann and K. Katsarou, “A systematic review on anomaly detection for cloud computing environments,” in Proceedings of the 2020 3rd Artificial Intelligence and Cloud Computing Conference, ser. AICCC ’20. New York, NY, USA: Association for Computing Machinery, 2021, p. 83–96. [Online]. Available: https://doi.org/10.1145/3442536.3442550

  17. [17]

    Bridging the Gap: LLM-Powered Transfer Learning for Log Anomaly Detection in New Software Systems

    Y. Sui, X. Wang, T. Cui, T. Xiao, C. He, S. Zhang, Y. Zhang, X. Yang, Y. Sun, and D. Pei, “Bridging the Gap: LLM-Powered Transfer Learning for Log Anomaly Detection in New Software Systems,” in 2025 IEEE 41st International Conference on Data Engineering (ICDE), May 2025, pp. 4414–4427, ISSN: 2375-026X. [Online]. Available: https://ieeexplore.ieee.org/d...

  18. [18]

    Cyber security attack recognition on cloud computing networks based on graph convolutional neural network and graphsage models

    F. Abdullayeva and S. Suleymanzade, “Cyber security attack recognition on cloud computing networks based on graph convolutional neural network and graphsage models,” Results in Control and Optimization, vol. 15, p. 100423, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2666720724000535

  19. [19]

    Regvd: revisiting graph neural networks for vulnerability detection

    V.-A. Nguyen, D. Q. Nguyen, V. Nguyen, T. Le, Q. H. Tran, and D. Phung, “Regvd: revisiting graph neural networks for vulnerability detection,” in Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, ser. ICSE ’22. New York, NY, USA: Association for Computing Machinery, 2022, p. 178–182. [Online]. Avai...

  20. [20]

    Applying self-supervised learning to network intrusion detection for network flows with graph neural network

    R. Xu, G. Wu, W. Wang, X. Gao, A. He, and Z. Zhang, “Applying self-supervised learning to network intrusion detection for network flows with graph neural network,” Computer Networks, vol. 248, p. 110495, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S138912862400327X

  21. [21]

    Predictdeep: Security analytics as a service for anomaly detection and prediction

    M. A. Elsayed and M. Zulkernine, “Predictdeep: Security analytics as a service for anomaly detection and prediction,” IEEE Access, vol. 8, pp. 45 184–45 197, 2020

  22. [22]

    Semi-Supervised Classification with Graph Convolutional Networks

    T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” CoRR, vol. abs/1609.02907, 2016. [Online]. Available: http://arxiv.org/abs/1609.02907

  23. [23]

    Euler: Detecting network lateral movement via scalable temporal link prediction

    I. J. King and H. H. Huang, “Euler: Detecting network lateral movement via scalable temporal link prediction,” ACM Trans. Priv. Secur., vol. 26, no. 3, Jun. 2023. [Online]. Available: https://doi.org/10.1145/3588771

  24. [24]

    Inductive Representation Learning on Temporal Graphs

    D. Xu, C. Ruan, E. Korpeoglu, S. Kumar, and K. Achan, “Inductive Representation Learning on Temporal Graphs,” in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=rJeW1yHYwH

  25. [25]

    Temporal graph benchmark for machine learning on temporal graphs

    S. Huang, F. Poursafaei, J. Danovitch, M. Fey, W. Hu, E. Rossi, J. Leskovec, M. Bronstein, G. Rabusseau, and R. Rabbany, “Temporal graph benchmark for machine learning on temporal graphs,” in Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, Eds., vol. 36. Curran Associates, Inc., 2023, ...

  26. [26]

    Tgl: a general framework for temporal gnn training on billion-scale graphs

    H. Zhou, D. Zheng, I. Nisa, V. Ioannidis, X. Song, and G. Karypis, “Tgl: a general framework for temporal gnn training on billion-scale graphs,” Proc. VLDB Endow., vol. 15, no. 8, p. 1572–1580, Apr. 2022. [Online]. Available: https://doi.org/10.14778/3529337.3529342

  27. [27]

    Neural message passing for quantum chemistry

    J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Proceedings of the 34th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, D. Precup and Y. W. Teh, Eds., vol. 70. PMLR, 06–11 Aug 2017, pp. 1263–1272. [Online]. Available: https://proceedings...