arxiv: 2407.18832 · v1 · pith:52Z6IQXSnew · submitted 2024-07-26 · 💻 cs.CR

Accurate and Scalable Detection and Investigation of Cyber Persistence Threats

Pith reviewed 2026-05-23 23:26 UTC · model grok-4.3

classification 💻 cs.CR
keywords APT detection · cyber persistence · provenance analytics · false positive reduction · persistence threats · alert triage · pseudo-dependency edges

The pith

A detector identifies cyber persistence threats by causally linking their setup and execution phases through provenance data, cutting the average false positive rate by 93% relative to state-of-the-art methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Cyber Persistence Detector system to identify stealthy persistence in Advanced Persistent Threat attacks. It models such operations as two phases, setup and execution, that can be related through provenance graphs. The system adds pseudo-dependency edges to bridge these phases and applies an alert triage algorithm. On standard datasets this yields a 93 percent lower average false positive rate than prior methods while supporting scalable investigation.

Core claim

Persistent operations typically manifest in two phases, the persistence setup and the subsequent persistence execution. By causally relating these phases through provenance analytics and the introduction of pseudo-dependency edges, the CPD system first discerns setups that signal an impending threat and then traces processes linked to remote connections to identify execution activities. Expert-guided edges further speed tracing and shrink log size. A novel alert triage algorithm reduces associated false positives, and evaluations on well-known datasets show an average 93 percent reduction in false positive rate compared with state-of-the-art methods.
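To make the headline number concrete, the sketch below shows one way a relative reduction in false positive rate could be computed. The per-dataset rates are invented purely for illustration and are not taken from the paper. The function names are hypothetical, and treating "average" as the mean over datasets is an assumption, since the abstract does not specify how the averaging is done.

```python
from statistics import mean
from typing import Sequence


def relative_reduction(baseline_fpr: float, new_fpr: float) -> float:
    """Fraction by which new_fpr is lower than baseline_fpr."""
    if baseline_fpr <= 0:
        raise ValueError("baseline FPR must be positive")
    return (baseline_fpr - new_fpr) / baseline_fpr


def reduction_of_average(baseline: Sequence[float], new: Sequence[float]) -> float:
    """Relative reduction of the dataset-averaged FPR (one reading of 'average')."""
    return relative_reduction(mean(baseline), mean(new))


# Hypothetical per-dataset false positive rates, invented for illustration only.
baseline_fprs = [0.10, 0.20, 0.30]
cpd_fprs = [0.010, 0.010, 0.022]

reduction = reduction_of_average(baseline_fprs, cpd_fprs)
print(f"Relative reduction of the average FPR: {reduction:.0%}")
```

Averaging the per-dataset reductions instead of reducing the averaged rates would generally give a different figure, which is one reason the evaluation details matter when checking the claim.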

What carries the argument

Pseudo-dependency edges that connect disjoint persistence setup and execution phases using data provenance analysis.
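The linking idea can be sketched as a lookup from execution events back into a table of earlier setup actions. This is a minimal illustration and not the paper's algorithm: the record types, the matching key (technique plus artifact), the strict time-ordering check, and all names and values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SetupRecord:
    technique: str    # e.g. an ATT&CK sub-technique ID
    artifact: str     # object touched during setup (hypothetical key or path)
    process: str
    timestamp: float


@dataclass(frozen=True)
class ExecutionRecord:
    technique: str
    artifact: str     # object the execution phase was launched from
    process: str
    timestamp: float


def build_setup_table(setups: List[SetupRecord]) -> Dict[Tuple[str, str], List[SetupRecord]]:
    """Index setup actions by (technique, artifact) for constant-time lookup."""
    table: Dict[Tuple[str, str], List[SetupRecord]] = {}
    for record in setups:
        table.setdefault((record.technique, record.artifact), []).append(record)
    return table


def link_pseudo_edges(
    setups: List[SetupRecord], executions: List[ExecutionRecord]
) -> List[Tuple[SetupRecord, ExecutionRecord]]:
    """Pair each execution with earlier setups on the same technique and artifact."""
    table = build_setup_table(setups)
    edges = []
    for execution in executions:
        for setup in table.get((execution.technique, execution.artifact), []):
            if setup.timestamp < execution.timestamp:  # setup must precede execution
                edges.append((setup, execution))
    return edges


# Hypothetical events: one matching setup/execution pair and one unrelated execution.
setups = [SetupRecord("T1547.001", r"HKCU\Run\Updater", "dropper.exe", 10.0)]
executions = [
    ExecutionRecord("T1547.001", r"HKCU\Run\Updater", "payload.exe", 100.0),
    ExecutionRecord("T1547.001", r"HKCU\Run\Other", "other.exe", 120.0),
]
pseudo_edges = link_pseudo_edges(setups, executions)
```

Here only the first execution is linked, because the second one refers to an artifact that never appears in the setup table. How the real system decides that a setup and an execution belong together is described in the paper, not in this sketch.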

If this is right

  • Expert-guided edges enable faster tracing of persistence activities together with reduced log size.
  • The alert triage algorithm further lowers false positives tied to persistence threats.
  • The two-phase model supports both accurate detection and efficient investigation of cyber persistence on standard datasets.
  • Provenance-based linking of phases scales to large audit logs while maintaining low false-positive rates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same causal-phase linking could be tested on other multi-stage attack behaviors such as lateral movement or data exfiltration.
  • Real-time deployment would require measuring how often attackers deliberately break the assumed causal relation between setup and execution.
  • Combining the pseudo-edges with existing host-based telemetry might extend coverage to endpoints without full system-call logging.

Load-bearing premise

Persistent operations typically manifest in two phases that can be causally related through provenance analytics and the introduced pseudo-dependency edges.

What would settle it

A labeled dataset of persistence threats in which setup and execution phases lack recoverable causal links through provenance, where the claimed false-positive reduction fails to appear.

Figures

Figures reproduced from arXiv: 2407.18832 by Kaibin Bao, Mati Ur Rehman, Muhammad Shoaib, Qi Liu, Veit Hagenmeyer, Wajih Ul Hassan.

Figure 1. Stealthiness by persistence. … ⋆ triages persistence-related threat alerts, and ⋆ generates accurate graphs for quick incident response. CPD is rooted in a thorough analysis of the MITRE ATT&CK framework [35], which is recognized as the most comprehensive and widely referenced directory of persistence threats. We discovered that effective persistence attacks always consist of two phases: the persistence setup …
Figure 2. CPD overview. CPD implements a four-step approach for detecting persistence threats, starting with the creation of a persistence setup table from audit logs that tracks potential setup actions. It then traces processes with remote connections to form sub-graphs, which are evaluated against execution rules and aligned with setup actions to form atomic graphs linked by a pseudo-edge. The process is refined t…
Figure 3. A persistence attack graph automatically generated by CPD on the EP-APT29-1 dataset. It uses rectangles for processes, ovals for files / Registry keys, and diamonds for network sockets. Annotations include S=Start, W=Write, C=Connect. The graph successfully pinpoints T1547.001 (Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder). The upper section reveals persistence setup: a malicious M…
Figure 4. An expert-guided edge is created during reconstruction of a T1543.003 persistence setup attack graph. An attacker-controlled malicious process leverages LOLBins to create a malicious service for persistence. The indicative Registry key is, however, modified by a Windows system process, to which no link from the malicious process can be built using logs from standard logging frameworks. Algorithm 2: EXPERT-GU…
Figure 5. A false-positive persistence attack graph automatically generated by CPD on the EP-APT29-1 dataset. This graph wrongly classifies an instance of T1547.001. It turns out to be a benign program, i.e., Microsoft OneDrive, leveraging Registry run keys for updates. It in fact connects back to an IP address belonging to Microsoft Corporation. … before persistence techniques. However, we find that this indicator te…
Figure 6. CDF of threat score for false and true alerts. … for T1574.009 [61] uses only a program file path as condition, leading to an excessive number of alerts. We also find that the high number of alerts is partly due to the fact that the Sigma rule repository contains many similar rules created by different contributors for the same attack (sub-)techniques. We argue that the repository maintainers should more properl…
Figure 7. CDF of response time of CPD.
Figure 8. WMI persistence attack graph automatically generated by CPD on DARPA OpTC dataset.

TABLE IX: Memory utilization (MB) of running CPD

          DARPA-OpTC  DARPA-E5  ATLASv2  EP-APT29-1  EP-APT29-2  EP-Sandworm-1
  max          10830     18464      579        3625        3513           3484
  mean          4484     11720      247         892         895            573

… rules. We measure the stage 2 response time as the time to perform backward tracing on an indicative process with remote connection(s) while match…
Original abstract

In Advanced Persistent Threat (APT) attacks, achieving stealthy persistence within target systems is often crucial for an attacker's success. This persistence allows adversaries to maintain prolonged access, often evading detection mechanisms. Recognizing its pivotal role in the APT lifecycle, this paper introduces Cyber Persistence Detector (CPD), a novel system dedicated to detecting cyber persistence through provenance analytics. CPD is founded on the insight that persistent operations typically manifest in two phases: the "persistence setup" and the subsequent "persistence execution". By causally relating these phases, we enhance our ability to detect persistent threats. First, CPD discerns setups signaling an impending persistent threat and then traces processes linked to remote connections to identify persistence execution activities. A key feature of our system is the introduction of pseudo-dependency edges (pseudo-edges), which effectively connect these disjoint phases using data provenance analysis, and expert-guided edges, which enable faster tracing and reduced log size. These edges empower us to detect persistence threats accurately and efficiently. Moreover, we propose a novel alert triage algorithm that further reduces false positives associated with persistence threats. Evaluations conducted on well-known datasets demonstrate that our system reduces the average false positive rate by 93% compared to state-of-the-art methods.
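The tracing step mentioned in the abstract, starting from a process with a remote connection and walking back through its causal history, can be illustrated with a generic time-respecting backward traversal. This is a minimal sketch of a common provenance technique, not CPD's implementation: the edge format, the function name, the process chain, and the timestamps are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

# An edge means "source influenced target at this time" (hypothetical format).
Edge = Tuple[str, str, float]


def backward_trace(edges: Iterable[Edge], start: str, start_time: float) -> Set[str]:
    """Return every node that could have causally influenced `start` by `start_time`."""
    incoming: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for source, target, timestamp in edges:
        incoming[target].append((source, timestamp))

    # latest[node] is the latest moment at which node's state still matters.
    latest: Dict[str, float] = {start: start_time}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        bound = latest[node]
        for source, timestamp in incoming[node]:
            # Follow only events that happened no later than the bound, and
            # revisit a source only if this path extends its relevant window.
            if timestamp <= bound and timestamp > latest.get(source, float("-inf")):
                latest[source] = timestamp
                queue.append(source)
    return set(latest)


# Hypothetical event log: a process chain ending in an outbound connection,
# plus one unrelated event that occurs after the connection.
events: List[Edge] = [
    ("services.exe", "svchost.exe", 1.0),
    ("svchost.exe", "wmiprvse.exe", 2.0),
    ("wmiprvse.exe", "powershell.exe", 3.0),
    ("powershell.exe", "socket:203.0.113.5:443", 4.0),
    ("explorer.exe", "powershell.exe", 9.0),
]
ancestry = backward_trace(events, "powershell.exe", start_time=4.0)
```

The timestamp check is what keeps `explorer.exe` out of the result: its interaction happens after the connection and so cannot have caused it. Without that check, backward tracing tends to pull in large amounts of unrelated activity.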

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces the Cyber Persistence Detector (CPD), a provenance analytics-based system for detecting Advanced Persistent Threat (APT) persistence. It identifies two phases of persistent operations—setup and execution—and uses pseudo-dependency edges and expert-guided edges to causally link them, along with a novel alert triage algorithm. The key result is a claimed 93% reduction in average false positive rate compared to state-of-the-art methods on well-known datasets.

Significance. Should the 93% FPR reduction be substantiated with detailed, reproducible evaluations, the work would offer a meaningful advance in provenance-based threat detection by tackling the challenge of linking disjoint persistence phases. The pseudo-dependency edges could provide a generalizable technique for enhancing causal analysis in security provenance graphs.

major comments (2)
  1. [Abstract] The abstract asserts a 93% false-positive reduction but supplies no information on datasets, baselines, evaluation metrics, or methodology, so the data cannot be checked against the claim. This is load-bearing for the central performance result.
  2. [System overview / edge construction] The system's performance rests on pseudo-dependency edges (and expert-guided edges) correctly linking the persistence setup and execution phases without excessive false connections. No ablation study on these edges or measured precision of the linking step is reported, leaving the weakest assumption untested.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and will incorporate revisions to strengthen the manuscript.

  1. Referee: [Abstract] The abstract asserts a 93% false-positive reduction but supplies no information on datasets, baselines, evaluation metrics, or methodology, so the data cannot be checked against the claim. This is load-bearing for the central performance result.

    Authors: We agree the abstract should be more self-contained. In the revision we will expand it to name the standard provenance datasets used, the state-of-the-art baselines, the primary metric (average false-positive rate), and the high-level evaluation methodology that produced the 93% reduction figure. revision: yes

  2. Referee: [System overview / edge construction] The system's performance rests on pseudo-dependency edges (and expert-guided edges) correctly linking the persistence setup and execution phases without excessive false connections. No ablation study on these edges or measured precision of the linking step is reported, leaving the weakest assumption untested.

    Authors: The observation is correct; the current manuscript reports only end-to-end results and does not contain an ablation or precision measurement for the pseudo-dependency and expert-guided edges. We will add a dedicated ablation study in the revised version that isolates the contribution of these edges and reports the precision of the linking step. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

The paper introduces a new detection system (CPD) with novel pseudo-dependency edges and an alert triage algorithm, then evaluates its false-positive reduction on external well-known datasets against independent state-of-the-art baselines. No load-bearing step reduces by construction to fitted inputs, self-citations, or renamed prior results; the central performance claim rests on external measurement rather than tautological re-derivation of its own assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The central claim rests on the two-phase model of persistence (a domain assumption) and on the effectiveness of newly introduced pseudo-dependency edges (an invented entity) whose value is asserted via the evaluation claim.

axioms (1)
  • [domain assumption] Persistent operations typically manifest in two phases: the persistence setup and the subsequent persistence execution.
    This is presented as the foundational insight in the abstract.
invented entities (2)
  • pseudo-dependency edges (no independent evidence)
    purpose: To connect the disjoint phases of persistence setup and execution using data provenance analysis.
    Newly introduced mechanism whose independent evidence is limited to the claimed evaluation results.
  • expert-guided edges (no independent evidence)
    purpose: To enable faster tracing and reduced log size.
    Newly introduced mechanism whose independent evidence is limited to the claimed evaluation results.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages

  1. [1]

    MITRE ATT&CK

    The MITRE Corporation. “MITRE ATT&CK.” Accessed: Jan. 2023, [Online]. Available: https://attack.mitre.org

  2. [2]

    Crowdstrike 2023 global threat report,

    CrowdStrike, Inc., “Crowdstrike 2023 global threat report,” 2023. [Online]. Available: https://www.crowdstrike.com/global-threat-report/

  3. [3]

    Sandworm Team

    The MITRE Corporation. “Sandworm Team.” Accessed: June 2023, [Online]. Available: https://attack.mitre.org/groups/G0034/

  4. [4]

    Sandworm intrusion set campaign targeting Centreon systems,

    French Cybersecurity Agency, “Sandworm intrusion set campaign targeting Centreon systems,” 2021. [Online]. Available: https://www.cert.ssi.gouv.fr/uploads/CERTFR-2021-CTI-005.pdf

  5. [5]

    P.A.S. Webshell

    Pulsedive. “P.A.S. Webshell.” Accessed: Sept. 2023, [Online]. Available: https://pulsedive.com/threat/P.A.S.%5C%20Webshell

  6. [6]

    Deep dive into the Solorigate second-stage activation

    Microsoft Threat Intelligence. “Deep dive into the Solorigate second-stage activation.” Accessed: Oct. 2023, [Online]. Available: https://www.microsoft.com/en-us/security/blog/2021/01/20/deep-dive-into-the-solorigate-second-stage-activation-from-sunburst-to-teardrop-and-raindrop/

  7. [7]

    SolarWinds hack: the mystery of one of the biggest cyberattacks ever

    P. Paganini. “SolarWinds hack: the mystery of one of the biggest cyberattacks ever.” Accessed: Oct. 2023, [Online]. Available: https://cybernews.com/security/solarwinds-hack-the-mystery-of-one-of-the-biggest-cyberattacks-ever/

  8. [8]

    SUNSPOT: An Implant in the Build Process

    CrowdStrike Intelligence Team. “SUNSPOT: An Implant in the Build Process.” Accessed: Oct. 2023, [Online]. Available: https://www.crowdstrike.com/blog/sunspot-malware-technical-analysis/

  9. [9]

    UNICORN: Runtime provenance-based detector for advanced persistent threats,

    X. Han, T. Pasquier, A. Bates, J. Mickens, and M. Seltzer, “UNICORN: Runtime provenance-based detector for advanced persistent threats,” in Proc. Netw. Distrib. Syst. Secur. Symp., 2020, pp. 1–18

  10. [10]

    ProTracer: Towards practical provenance tracing by alternating between logging and tainting,

    S. Ma, X. Zhang, and D. Xu, “ProTracer: Towards practical provenance tracing by alternating between logging and tainting,” in Proc. Netw. Distrib. Syst. Secur. Symp. , 2016, pp. 1–15

  11. [11]

    SLEUTH: Real-time attack scenario reconstruction from COTS audit data,

    M. N. Hossain, S. M. Milajerdi, J. Wang, B. Eshete, R. Gjomemo, R. Sekar, S. D. Stoller, and V. Venkatakrishnan, “SLEUTH: Real-time attack scenario reconstruction from COTS audit data,” in Proc. USENIX Secur. Symp., 2017, pp. 487–504

  12. [12]

    Towards scalable cluster auditing through grammatical inference over provenance graphs,

    W. U. Hassan, L. Mark, N. Aguse, A. Bates, and T. Moyer, “Towards scalable cluster auditing through grammatical inference over provenance graphs,” in Proc. Netw. Distrib. Syst. Secur. Symp., 2018, pp. 1–15

  13. [13]

    HOLMES: Real-time apt detection through correlation of suspicious information flows,

    S. M. Milajerdi, R. Gjomemo, B. Eshete, R. Sekar, and V. N. Venkatakrishnan, “HOLMES: Real-time apt detection through correlation of suspicious information flows,” in Proc. IEEE Symp. Secur. Privacy, 2019, pp. 1137–1152

  14. [14]

    NoDoze: Combatting threat alert fatigue with automated provenance triage,

    W. U. Hassan, S. Guo, D. Li, Z. Chen, K. Jee, Z. Li, and A. Bates, “NoDoze: Combatting threat alert fatigue with automated provenance triage,” in Proc. Netw. Distrib. Syst. Secur. Symp. , 2019, pp. 1–15

  15. [15]

    Tactical provenance analysis for endpoint detection and response systems,

    W. U. Hassan, A. Bates, and D. Marino, “Tactical provenance analysis for endpoint detection and response systems,” in Proc. IEEE Symp. Secur. Privacy, 2020, pp. 1172–1189

  16. [16]

    Combating dependence explosion in forensic analysis using alternative tag propagation semantics,

    M. N. Hossain, S. Sheikhi, and R. Sekar, “Combating dependence explosion in forensic analysis using alternative tag propagation semantics,” in Proc. IEEE Symp. Secur. Privacy, 2020, pp. 1139–1155

  17. [17]

    KAIROS: Practical intrusion detection and investigation using whole-system provenance,

    Z. Cheng, Q. Lv, J. Liang, Y. Wang, D. Sun, T. Pasquier, and X. Han, “KAIROS: Practical intrusion detection and investigation using whole-system provenance,” in Proc. IEEE Symp. Secur. Privacy, 2024, pp. 9–28

  18. [18]

    Flash: A comprehensive approach to intrusion detection via provenance graph representation learning,

    M. Rehman, H. Ahmadi, and W. Hassan, “Flash: A comprehensive approach to intrusion detection via provenance graph representation learning,” in Proc. IEEE Symp. Secur. Privacy , 2024, pp. 142–161

  19. [19]

    Shadewatcher: Recommendation-guided cyber threat analysis using system audit records,

    J. Zeng, X. Wang, J. Liu, Y. Chen, Z. Liang, T.-S. Chua, and Z. L. Chua, “Shadewatcher: Recommendation-guided cyber threat analysis using system audit records,” in Proc. IEEE Symp. Secur. Privacy, 2022, pp. 489–506

  20. [20]

    Prographer: An anomaly detection system based on provenance graph embedding,

    F. Yang, J. Xu, C. Xiong, Z. Li, and K. Zhang, “Prographer: An anomaly detection system based on provenance graph embedding,” in Proc. USENIX Secur. Symp. , 2023, pp. 4355–4372

  21. [21]

    Elastic Detection Rules

    Elastic. “Elastic Detection Rules.” Accessed: Sept. 2023, [Online]. Available: https://github.com/elastic/detection-rules

  22. [22]

    Chronicle Detection Rules

    Google Security Operations. “Chronicle Detection Rules.” Accessed: Sept. 2023, [Online]. Available: https://github.com/chronicle/detection-rules

  23. [23]

    Sigma

    SigmaHQ. “Sigma.” Accessed: Sept. 2023, [Online]. Available: https://github.com/SigmaHQ/sigma

  24. [24]

    MITRE T1547001

    The MITRE Corporation. “MITRE T1547001.” Accessed: May 2023, [Online]. Available: https://attack.mitre.org/techniques/T1547/001/

  25. [25]

    You are what you do: Hunting stealthy malware via data provenance analysis,

    Q. Wang, W. U. Hassan, D. Li, K. Jee, X. Yu, K. Zou, J. Rhee, Z. Chen, W. Cheng, C. A. Gunter, et al., “You are what you do: Hunting stealthy malware via data provenance analysis,” in Proc. Netw. Distrib. Syst. Secur. Symp., 2020, pp. 1–17

  26. [26]

    Threatrace: Detecting and tracing host-based threats in node level through provenance graph learning,

    S. Wang, Z. Wang, T. Zhou, H. Sun, X. Yin, D. Han, H. Zhang, X. Shi, and J. Yang, “Threatrace: Detecting and tracing host-based threats in node level through provenance graph learning,” IEEE Transactions on Information Forensics and Security , vol. 17, pp. 3972–3987, 2022

  27. [27]

    Z. Jia, Y. Xiong, Y. Nan, Y. Zhang, J. Zhao, and M. Wen, Magic: Detecting advanced persistent threats via masked graph representation learning, 2023. arXiv: 2310.09831 [cs.CR]

  28. [28]

    Sometimes, you aren’t what you do: Mimicry attacks against provenance graph host intrusion detection systems,

    A. Goyal, X. Han, G. Wang, and A. Bates, “Sometimes, you aren’t what you do: Mimicry attacks against provenance graph host intrusion detection systems,” in Proc. Netw. Distrib. Syst. Secur. Symp. , 2023, pp. 1–18

  29. [29]

    Evading Provenance-Based ML detectors with adversarial system actions,

    K. Mukherjee, J. Wiedemeier, T. Wang, J. Wei, F. Chen, M. Kim, M. Kantarcioglu, and K. Jee, “Evading Provenance-Based ML detectors with adversarial system actions,” in Proc. USENIX Secur. Symp., 2023, pp. 1199–1216

  30. [30]

    Alert Fatigue

    E. Segal. “Alert Fatigue.” Accessed: Aug. 2023, [Online]. Available: https://www.forbes.com/sites/edwardsegal/2021/11/08/alert-fatigue-can-lead-to-missed-cyber-threats-and-staff-retentionrecruitment-issues-study/?sh=4c96871035c9

  31. [31]

    In Cybersecurity Every Alert Matters,

    C. Robinson, “In Cybersecurity Every Alert Matters,” 2021. [Online]. Available: https://www.criticalstart.com/wp-content/uploads/2021/11/ US48277521 TLWP.pdf

  32. [32]

    99% false positives: A qualitative study of soc analysts’ perspectives on security alarms,

    B. A. Alahmadi, L. Axon, and I. Martinovic, “99% false positives: A qualitative study of soc analysts’ perspectives on security alarms,” in Proc. USENIX Secur. Symp. , 2022, pp. 2783–2800

  33. [33]

    The defenders’ dilemma,

    M. Wojtasiak, “The defenders’ dilemma,” 2023. [Online]. Available: https://info.vectra.ai/state-of-threat-detection

  34. [34]

    The Impact of Security Alert Overload,

    CRITICALSTART, “The Impact of Security Alert Overload,” 2019. [Online]. Available: https://www.criticalstart.com/wp-content/uploads/ 2021/02/CS Report-The-Impact-of-Security-Alert-Overload.pdf

  35. [35]

    MITRE Matrix

    The MITRE Corporation. “MITRE Matrix.” Accessed: Jan. 2023, [Online]. Available: https://attack.mitre.org/matrices/enterprise/

  36. [36]

    MITRE Adversary Emulation Library

    The MITRE Corporation. “MITRE Adversary Emulation Library.” Accessed: Jan. 2023, [Online]. Available: https://github.com/center-for-threat-informed-defense/adversary_emulation_library

  37. [37]

    MITRE Attack Stix Data

    The MITRE Corporation. “MITRE Attack Stix Data.” Accessed: April 2023, [Online]. Available: https://github.com/mitre-attack/attack-stix-data

  38. [38]

    DARPA OpTC

    M. van Opstal and W. Arbaugh. “DARPA OpTC.” Accessed: Sept. 2023, [Online]. Available: https://github.com/FiveDirections/OpTC-data

  39. [39]

    DARPA Transparent Computing

    J. Torrey. “DARPA Transparent Computing.” Accessed: Sept. 2023, [Online]. Available: https://github.com/darpa-i2o/Transparent-Computing

  40. [40]

    Atlas: A sequence-based learning approach for attack investigation,

    A. Alsaheel, Y. Nan, S. Ma, L. Yu, G. Walkup, Z. B. Celik, X. Zhang, and D. Xu, “Atlas: A sequence-based learning approach for attack investigation,” in Proc. USENIX Secur. Symp., 2021, pp. 3005–3022

  41. [41]

    System Monitor

    M. Russinovich and T. Garnier. “System Monitor.” Accessed: Feb. 2023, [Online]. Available: https://learn.microsoft.com/en-us/sysinternals/downloads/sysmon

  42. [42]

    The Linux audit daemon

    S. Grubb, The Linux audit daemon, Accessed: Feb. 2023. [Online]. Available: https://linux.die.net/man/8/auditd

  43. [43]

    Atomic Red Team

    Red Canary. “Atomic Red Team.” Accessed: Jan. 2023, [Online]. Available: https://atomicredteam.io/

  44. [44]

    EQL search

    Elastic NV. “EQL search.” Accessed: Sept. 2023, [Online]. Available: https://www.elastic.co/guide/en/elasticsearch/reference/current/eql.html

  45. [45]

    R. Uetz, M. Herzog, L. Hackländer, S. Schwarz, and M. Henze, You cannot escape me: Detecting evasions of SIEM rules in enterprise networks, 2023. arXiv: 2311.10197 [cs.CR]

  46. [46]

    Windows Internals, Part 2, 7th Edition

    A. Allievi, A. Ionescu, D. A. Solomon, K. Chase, and M. E. Russinovich, Windows Internals, Part 2, 7th Edition. Microsoft Press, 2022

  47. [47]

    APT29

    The MITRE Corporation. “APT29.” Accessed: April 2023, [Online]. Available: https://attack.mitre.org/groups/G0016/

  48. [48]

    Wizard Spider

    The MITRE Corporation. “Wizard Spider.” Accessed: April 2023, [Online]. Available: https://attack.mitre.org/groups/G0102/

  49. [49]

    Windows Remote Management

    S. White. “Windows Remote Management.” Accessed: March 2023, [Online]. Available: https://learn.microsoft.com/en-us/windows/win32/winrm/portal

  50. [50]

    Carbanak

    The MITRE Corporation. “Carbanak.” Accessed: April 2023, [Online]. Available: https://attack.mitre.org/groups/G0008/

  51. [51]

    Elasticsearch

    Elastic NV. “Elasticsearch.” Accessed: Sept. 2023, [Online]. Available: https://www.elastic.co/

  52. [52]

    NetworkX

    NetworkX developers. “NetworkX.” Accessed: Sept. 2023, [Online]. Available: https://networkx.org/

  53. [53]

    PyVis

    West Health Institute. “PyVis.” Accessed: Sept. 2023, [Online]. Available: https://pyvis.readthedocs.io/en/latest/

  54. [54]

    NT Kernel Logger

    D. Marshall. “NT Kernel Logger.” Accessed: Feb. 2023, [Online]. Available: https://learn.microsoft.com/en-us/windows-hardware/drivers/devtest/nt-kernel-logger-trace-session

  55. [55]

    DARPA Transparent Computing E3

    A. D. Keromytis. “DARPA Transparent Computing E3.” Accessed: Sept. 2023, [Online]. Available: https://github.com/darpa-i2o/Transparent-Computing/blob/master/README-E3.md

  56. [56]

    ATLASv2

    A. Riddle, K. Westfall, and A. Bates. “ATLASv2.” Accessed: Oct. 2023, [Online]. Available: https://bitbucket.org/sts-lab/atlasv2/src/master/

  57. [57]

    Carbon Black Cloud

    VMware LLC. “Carbon Black Cloud.” Accessed: Oct. 2023, [Online]. Available: https://www.vmware.com/products/carbon-black-cloud.html

  58. [58]

    MITRE Engenuity

    The MITRE Corporation. “MITRE Engenuity.” Accessed: Jan. 2023, [Online]. Available: https://attackevals.mitre-engenuity.org/

  59. [59]

    MITRE Engenuity Evaluation

    The MITRE Corporation. “MITRE Engenuity Evaluation.” Accessed: Jan. 2023, [Online]. Available: https://attackevals.mitre-engenuity.org/enterprise/wizard-spider-sandworm/

  60. [60]

    An Elastic rule sample

    Elastic. “An Elastic rule sample.” Accessed: Sept. 2023, [Online]. Available: https://github.com/elastic/detection-rules/blob/main/rules/windows/persistence_registry_uncommon.toml

  61. [61]

    A Sigma rule sample

    SigmaHQ. “A Sigma rule sample.” Accessed: Sept. 2023, [Online]. Available: https://github.com/SigmaHQ/sigma/blob/master/rules/windows/file/file_event/file_event_win_creation_unquoted_service_path.yml

  62. [62]

    A Chronicle rule sample

    Google Security Operations. “A Chronicle rule sample.” Accessed: Sept. 2023, [Online]. Available: https://github.com/chronicle/detection-rules/blob/main/mitre_attack/T1053_005_windows_creation_of_scheduled_task.yaral

  63. [63]

    Memory Profiler

    F. Pedregosa and P. Gervais. “Memory Profiler.” Accessed: Oct. 2023, [Online]. Available: https://pypi.org/project/memory-profiler/

  64. [64]

    Hopper: Modeling and detecting lateral movement,

    G. Ho, M. Dhiman, D. Akhawe, V. Paxson, S. Savage, G. M. Voelker, and D. A. Wagner, “Hopper: Modeling and detecting lateral movement,” in Proc. USENIX Secur. Symp., 2021, pp. 3093–3110

  65. [65]

    Euler: Detecting network lateral movement via scalable temporal link prediction,

    I. J. King and H. H. Huang, “Euler: Detecting network lateral movement via scalable temporal link prediction,” in Proc. Netw. Distrib. Syst. Secur. Symp., 2022, pp. 1–16

  66. [66]

    Detecting credential spearphishing in enterprise settings,

    G. Ho, A. Sharma, M. Javed, V. Paxson, and D. Wagner, “Detecting credential spearphishing in enterprise settings,” in Proc. USENIX Secur. Symp., 2017, pp. 469–485

  67. [67]

    Information based heavy hitters for real-time dns data exfiltration detection,

    Y. Ozery, A. Nadler, and A. Shabtai, “Information based heavy hitters for real-time dns data exfiltration detection,” in Proc. Netw. Distrib. Syst. Secur. Symp., 2024, pp. 1–15

  68. [68]

    Flow-based detection and proxy-based evasion of encrypted malware c2 traffic,

    C. Novo and R. Morla, “Flow-based detection and proxy-based evasion of encrypted malware c2 traffic,” in Proc. ACM Workshop on Artificial Intelligence and Security , 2020, pp. 83–91

  69. [69]

    UNVEIL: A Large-Scale, automated approach to detecting ransomware,

    A. Kharaz, S. Arshad, C. Mulliner, W. Robertson, and E. Kirda, “UNVEIL: A Large-Scale, automated approach to detecting ransomware,” in Proc. USENIX Secur. Symp., 2016, pp. 757–772

  70. [70]

    Sok: History is a vast early warning system: Auditing the provenance of system intrusions,

    M. A. Inam, Y. Chen, A. Goyal, J. Liu, J. Mink, N. Michael, S. Gaur, A. Bates, and W. U. Hassan, “Sok: History is a vast early warning system: Auditing the provenance of system intrusions,” in Proc. IEEE Symp. Secur. Privacy, 2023, pp. 2620–2638

  71. [71]

    High fidelity data reduction for big data security dependency analyses,

    Z. Xu, Z. Wu, Z. Li, K. Jee, J. Rhee, X. Xiao, F. Xu, H. Wang, and G. Jiang, “High fidelity data reduction for big data security dependency analyses,” in Proc. ACM SIGSAC Conf. Comput. Commun. Secur. , 2016, pp. 504–516

  72. [72]

    Elise: A storage efficient logging system powered by redundancy reduction and representation learning,

    H. Ding, S. Yan, J. Zhai, and S. Ma, “Elise: A storage efficient logging system powered by redundancy reduction and representation learning,” in Proc. USENIX Secur. Symp. , 2021, pp. 3023–3040

  73. [73]

    Nodemerge: Template based efficient data reduction for big-data causality analysis,

    Y. Tang, D. Li, Z. Li, M. Zhang, K. Jee, X. Xiao, Z. Wu, J. Rhee, F. Xu, and Q. Li, “Nodemerge: Template based efficient data reduction for big-data causality analysis,” in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., 2018, pp. 1324–1337

  74. [74]

    Dependence-Preserving data compaction for scalable forensic analysis,

    M. N. Hossain, J. Wang, R. Sekar, and S. D. Stoller, “Dependence-Preserving data compaction for scalable forensic analysis,” in Proc. USENIX Secur. Symp., 2018, pp. 1723–1740