arxiv: 2603.23966 · v3 · submitted 2026-03-25 · 💻 cs.CR · cs.AI

Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

Rishikesh Sahay, Bell Eapen, Weizhi Meng, Md Rasel Al Mamun, Nikhil Kumar Dora, Manjusha Sumasadan, Sumit Kumar Tetarave, Elyson De La Cruz

Pith reviewed 2026-05-15 00:59 UTC · model grok-4.3

keywords threat hunting · LLM · autoencoder · deep reinforcement learning · Splunk · SOC triage · anomaly detection · cybersecurity framework

The pith

An integrated framework uses autoencoders, deep reinforcement learning and LLMs inside Splunk to automate threat hunting and adapt to changing SOC priorities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a threat hunting system that ingests network traffic, applies a reconstruction-based autoencoder to flag anomalies, uses a two-layer deep reinforcement learning module for initial risk triage, and employs an LLM for contextual analysis of alerts. The goal is to reduce the volume of logs that SOC analysts must review manually while supporting decisions to block, allow, or monitor traffic. The framework is built to operate inside the Splunk SIEM platform and to adjust its behavior autonomously when security objectives shift. Evaluation on a public benchmark dataset and a simulated dataset is reported to show that the system can identify suspicious and malicious traffic effectively.

Core claim

The central claim is that systematically combining traffic ingestion, autoencoder-based anomaly assessment, two-layer DRL triage, and LLM contextual analysis within Splunk produces a policy-guided framework that autonomously adapts to different SOC objectives and reliably identifies suspicious and malicious network traffic.

What carries the argument

The policy-guided threat hunting framework that chains autoencoder anomaly detection, two-layer deep reinforcement learning triage, and LLM contextual analysis inside Splunk to prioritize and explain alerts.
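
The chain described above can be pictured as a gating pipeline. The following is a minimal sketch, not the authors' implementation: `Flow`, `triage`, the toy scoring functions, their thresholds, and the rule that only non-`allow` flows reach the LLM stage are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

ACTIONS = ("allow", "monitor", "block")  # decision set named in the paper


@dataclass
class Flow:
    src: str
    bytes_out: int


def triage(
    flows: List[Flow],
    aad_score: Callable[[Flow], float],    # stand-in for the autoencoder stage
    policy: Callable[[Flow, float], str],  # stand-in for the DRL stage
    explain: Callable[[Flow, str], str],   # stand-in for the LLM stage
) -> List[Tuple[Flow, str, str]]:
    """Score each flow, choose an action, and pass only non-'allow' flows to the LLM stage."""
    results = []
    for flow in flows:
        score = aad_score(flow)
        action = policy(flow, score)
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action == "allow":
            continue  # assumed gating rule: benign-looking flows skip the LLM
        results.append((flow, action, explain(flow, action)))
    return results


# Toy stand-ins with hypothetical thresholds so the sketch runs end to end.
def toy_score(flow: Flow) -> float:
    return min(flow.bytes_out / 1_000_000, 1.0)


def toy_policy(flow: Flow, score: float) -> str:
    if score >= 0.8:
        return "block"
    return "monitor" if score >= 0.3 else "allow"


def toy_explain(flow: Flow, action: str) -> str:
    return f"{flow.src}: recommended {action}"


flows = [Flow("10.0.0.5", 1_200), Flow("10.0.0.9", 450_000), Flow("10.0.0.7", 5_000_000)]
escalated = triage(flows, toy_score, toy_policy, toy_explain)
```

In this toy run only two of the three flows reach the explanation stage, which is the kind of reduction in LLM-bound traffic that the paper's figures report per mode.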

If this is right

The framework supports risk-based prioritization that lets analysts focus on higher-threat events.
SOC teams can use the system to make consistent block, allow, or monitor decisions across large log volumes.
The approach enables autonomous adaptation when network conditions or security policies change.
Evaluation on public and simulated datasets indicates the integrated modules can distinguish malicious traffic.
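
One way to picture how a change in SOC objective could shift decisions is to compare reward profiles on the same evidence. The sketch below is an assumption-heavy illustration: the profile names, every reward value, and the one-step expected-reward rule are hypothetical, and the rule stands in for the trained DRL policy rather than reproducing it.

```python
from typing import Dict

Reward = Dict[str, Dict[str, float]]

# Hypothetical reward tables: reward[action][ground truth]. Values are illustrative only.
PROFILES: Dict[str, Reward] = {
    "security_first": {
        "allow": {"benign": 1.0, "malicious": -10.0},
        "monitor": {"benign": -0.2, "malicious": 1.0},
        "block": {"benign": -1.0, "malicious": 5.0},
    },
    "availability_first": {
        "allow": {"benign": 1.0, "malicious": -3.0},
        "monitor": {"benign": -0.2, "malicious": 1.0},
        "block": {"benign": -6.0, "malicious": 2.0},
    },
}


def expected_reward(profile: Reward, action: str, p_malicious: float) -> float:
    """Expected one-step reward of an action given the probability the flow is malicious."""
    outcome = profile[action]
    return (1.0 - p_malicious) * outcome["benign"] + p_malicious * outcome["malicious"]


def best_action(profile_name: str, p_malicious: float) -> str:
    """Greedy choice under a profile (a stand-in for a learned policy)."""
    if not 0.0 <= p_malicious <= 1.0:
        raise ValueError("p_malicious must be in [0, 1]")
    profile = PROFILES[profile_name]
    return max(profile, key=lambda action: expected_reward(profile, action, p_malicious))


same_evidence = 0.5
decisions = {name: best_action(name, same_evidence) for name in PROFILES}
```

With these made-up numbers, identical evidence leads to `block` under the security-oriented profile and `monitor` under the availability-oriented one, so the behavior change comes from the reward table alone.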

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same modular pipeline could be tested on other SIEM platforms to check whether Splunk-specific integration is required.
Real-time feedback loops from analyst overrides could be added to retrain the DRL and LLM components without manual feature engineering.
Scalability limits might appear when the number of simultaneous data sources grows beyond the sizes used in the reported experiments.

Load-bearing premise

That the combination of autoencoder reconstruction error, DRL risk scoring, and LLM interpretation will deliver accurate low-false-positive decisions in live operational environments without extensive manual retuning.

What would settle it

A deployment in a production SOC where the framework generates high volumes of false positives or fails to change its triage behavior when SOC priorities are altered would show the central claim is incorrect.

Figures

Figures reproduced from arXiv: 2603.23966 by Bell Eapen, Elyson De La Cruz, Manjusha Sumasadan, Md Rasel Al Mamun, Nikhil Kumar Dora, Rishikesh Sahay, Sumit Kumar Tetarave, Weizhi Meng.

**Figure 2.** Use case illustrating the application of framework

**Figure 3.** Analysis of DNS traffic identified by the proposed RL and AAD triage mechanism

**Figure 4.** Performance Evaluation of Across Modes (A-D)

**Figure 5.** Average Reduction in Traffic Flows Forwarded to LLM Across Modes (A-D)

**Figure 6.** Performance Evaluation of Across Modes (A-D) on Simulated Dataset

**Figure 7.** Average Reduction in Traffic Forwarded to LLM for Analysis

**Figure 8.** Analysis of flows from malicious host

… allow the traffic in the network, block, or monitor it for a while. The whole process aligns with the real world SOC workflows and ensures that SOC analyst verifies DRL decision and LLM insights on the SIEM tool such as Splunk rather than only relying on the LLM. 8. Discussion Our framework provides a multi-layer threat detection architecture that combines Deep Reinfor…

read the original abstract

With frequently evolving Advanced Persistent Threats (APTs) in cyberspace, traditional security solutions approaches have become inadequate for threat hunting for organizations. Moreover, SOC (Security Operation Centers) analysts are often overwhelmed and struggle to analyze the huge volume of logs received from diverse devices in organizations. To address these challenges, we propose an automated and dynamic threat hunting framework for monitoring evolving threats, adapting to changing network conditions, and performing risk-based prioritization for the mitigation of suspicious and malicious traffic. By integrating Agentic AI with Splunk, an established SIEM platform, we developed a unique threat hunting framework. The framework systematically and seamlessly integrates different threat hunting modules together, ranging from traffic ingestion to anomaly assessment using a reconstruction-based autoencoder, deep reinforcement learning (DRL) with two layers for initial triage, and a large language model (LLM) for contextual analysis. We evaluated the framework against a publicly available benchmark dataset, as well as against a simulated dataset. The experimental results show that the framework can effectively adapt to different SOC objectives autonomously and identify suspicious and malicious traffic. The framework enhances operational effectiveness by supporting SOC analysts in their decision-making to block, allow, or monitor network traffic. This study thus enhances cybersecurity and threat hunting literature by presenting the novel threat hunting framework for security decision-making, as well as promoting cumulative research efforts to develop more effective frameworks to battle continuously evolving cyber threats.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper gives a concrete Splunk pipeline that wires an autoencoder, two-layer DRL, and LLM together for SOC triage, but the simulated-data results leave the autonomous-adaptation and low-FP claims unproven.

The paper walks through an end-to-end setup that ingests traffic, runs a reconstruction autoencoder to flag anomalies, passes the output to a two-layer DRL agent for policy-driven triage, and finishes with an LLM that supplies context before recommending block/allow/monitor actions inside Splunk. They tested the whole chain on one public benchmark and one simulated dataset and report that it adapts to different SOC objectives and surfaces malicious traffic.

The practical value is real: anyone already using Splunk gets a working example of how to stitch these pieces without starting from scratch, and the policy layer in the DRL is a straightforward way to let operators change priorities without full retraining. That kind of integration note is useful for teams that need deployable examples rather than new theory.

The soft spot is the evaluation. No accuracy numbers, no false-positive rates, no baseline comparisons, and no error bars appear in the description. Simulated datasets miss the volume, noise, and concept drift of live SOC logs, so the claim that the system autonomously maintains low false positives when objectives shift is not yet supported by the evidence. The stress-test note correctly flags this gap.

This is the sort of applied framework paper that belongs in a reading group focused on operational security tools. Readers building or evaluating AI for SOCs would get concrete architecture details and Splunk-specific lessons. The work shows clear, honest engagement with the practical problem even though the results are preliminary. I would send it to peer review with a request for quantitative metrics and at least one more realistic test set; the integration itself is solid enough to justify referee time.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a policy-guided threat hunting framework integrating a reconstruction-based autoencoder for anomaly detection, a two-layer deep reinforcement learning (DRL) model for initial triage, and an LLM for contextual analysis, all embedded in the Splunk SIEM platform. It claims this system autonomously adapts to varying SOC objectives and effectively identifies suspicious and malicious traffic, based on evaluations against a public benchmark dataset and a simulated dataset.

Significance. If the central claims were substantiated with rigorous metrics, the work would represent a meaningful contribution to automated cybersecurity by demonstrating a practical, multi-component AI pipeline for dynamic threat hunting that reduces analyst overload. The use of an established SIEM platform and the focus on policy-driven adaptation are positive aspects that could support cumulative research in the field.

major comments (2)

[Abstract] The assertion that 'the experimental results show that the framework can effectively adapt to different SOC objectives autonomously and identify suspicious and malicious traffic' is unsupported by any quantitative metrics, error bars, baseline comparisons, or details on avoiding post-hoc adjustments. This directly weakens the soundness of the adaptation and low-FP identification claims.
[Evaluation] The reported experiments rely exclusively on a public benchmark dataset and a simulated dataset. These lack the volume, noise levels, concept drift, and multi-objective variability of live SOC logs, leaving the load-bearing claim of autonomous adaptation and accurate low-false-positive triage without manual tuning untested in realistic operational conditions.
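
The metrics the referee asks for are simple to compute once per-flow ground truth and decisions are available. A minimal sketch follows; the `Confusion` class, the `confusion` helper, and the toy labels are hypothetical and do not reproduce any result from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class Confusion:
    tp: int
    fp: int
    tn: int
    fn: int

    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    def recall(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    def false_positive_rate(self) -> float:
        return _ratio(self.fp, self.fp + self.tn)


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Count outcomes, with 1 = malicious (or flagged) and 0 = benign (or not flagged)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lengths differ")
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
result = confusion(y_true, y_pred)
```

Reporting these per policy mode, alongside a baseline detector on the same split, would speak directly to both major comments.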

minor comments (1)

[Abstract] The description of the framework modules would benefit from explicit mention of the policy mechanism that enables autonomous adaptation to SOC objectives.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thorough review and constructive criticism. We address each major comment in detail below, indicating the revisions we intend to implement.

Referee: [Abstract] The assertion that 'the experimental results show that the framework can effectively adapt to different SOC objectives autonomously and identify suspicious and malicious traffic' is unsupported by any quantitative metrics, error bars, baseline comparisons, or details on avoiding post-hoc adjustments. This directly weakens the soundness of the adaptation and low-FP identification claims.

Authors: We agree that the abstract would be improved by including quantitative metrics to support the claims. In the revised manuscript, we will update the abstract to cite specific results from our experiments, such as the adaptation performance under different policies and the achieved false positive rates. We will also ensure the evaluation section provides baseline comparisons and details on the experimental protocol to demonstrate autonomous adaptation without post-hoc adjustments.

revision: yes

Referee: [Evaluation] The reported experiments rely exclusively on a public benchmark dataset and a simulated dataset. These lack the volume, noise levels, concept drift, and multi-objective variability of live SOC logs, leaving the load-bearing claim of autonomous adaptation and accurate low-false-positive triage without manual tuning untested in realistic operational conditions.

Authors: While public benchmark and simulated datasets are widely used for reproducibility in cybersecurity research, we acknowledge they do not fully capture the complexities of live SOC environments. In the revision, we will add a section explicitly discussing these limitations, including potential issues with concept drift and data volume. We will also outline how the framework could be extended for real-world deployment. This addresses the concern without altering the core evaluation approach.

revision: partial

standing simulated objections not resolved

The provision of results from live SOC log evaluations, which would require access to proprietary operational data not available for this study.

Circularity Check

0 steps flagged

No circularity in derivation chain; claims rest on external dataset evaluations

full rationale

The paper describes an integrated framework (autoencoder anomaly detection + two-layer DRL triage + LLM contextual analysis) evaluated on a public benchmark dataset and a simulated dataset. No equations, parameter-fitting steps, or self-citations are presented that reduce any claimed prediction or adaptation result to its own inputs by construction. The experimental claims of autonomous adaptation and low false-positive triage are supported by reported performance on external data sources rather than by internal redefinition or self-referential fitting, rendering the derivation chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific free parameters or axioms; no invented entities are described.

pith-pipeline@v0.9.0 · 5585 in / 1073 out tokens · 29717 ms · 2026-05-15T00:59:13.200080+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: DRL policy network … two hidden layers of 64 neurons … reward profiles (Modes A–D) … Triage Priority = DRL_Action × AAD_Score
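
The priority formula quoted in the passage can be illustrated with a small ranking sketch. The passage gives only the product form; the numeric action encoding (`allow` = 0, `monitor` = 1, `block` = 2), the function names, and the sample alerts are assumptions for illustration, not values from the paper.

```python
from typing import List, Tuple

# Assumed numeric encoding; the quoted passage does not say how actions map to numbers.
ACTION_VALUE = {"allow": 0, "monitor": 1, "block": 2}


def triage_priority(action: str, aad_score: float) -> float:
    """Triage Priority = DRL_Action x AAD_Score, under the assumed encoding."""
    if action not in ACTION_VALUE:
        raise ValueError(f"unknown action: {action}")
    return ACTION_VALUE[action] * aad_score


def rank(alerts: List[Tuple[str, str, float]]) -> List[Tuple[str, float]]:
    """Take (flow_id, action, aad_score) tuples and return (flow_id, priority), highest first."""
    scored = [(flow_id, triage_priority(action, score)) for flow_id, action, score in alerts]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical alerts.
alerts = [("f1", "monitor", 0.9), ("f2", "block", 0.6), ("f3", "allow", 0.95)]
ranked = rank(alerts)
```

Under this encoding a flow the policy marks `allow` always gets priority 0 regardless of its anomaly score, which may or may not match the paper's intent.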

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: reconstruction-based autoencoder … bottleneck (8-2-8) … trained on early benign traffic
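
Reconstruction-based scoring, as mentioned in the passage, can be sketched without a neural network. In the example below a toy encoder that compresses each record to its mean stands in for the trained 8-2-8 autoencoder; that stand-in, the 99th-percentile cut-off, the function names, and the synthetic "benign" records are all assumptions. Only the threshold is calibrated on benign data here, since the toy encoder has no trainable parameters.

```python
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]


def mean_bottleneck(x: Vector) -> List[float]:
    """Toy encoder/decoder: compress to one number (the mean), then expand back."""
    code = sum(x) / len(x)
    return [code] * len(x)


def reconstruction_error(x: Vector, reconstruct: Callable[[Vector], Sequence[float]]) -> float:
    """Mean squared error between a record and its reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(
    benign: Sequence[Vector],
    reconstruct: Callable[[Vector], Sequence[float]],
    quantile: float = 0.99,  # assumed cut-off
) -> float:
    """Pick the error level that the chosen share of benign records stays at or below."""
    errors = sorted(reconstruction_error(x, reconstruct) for x in benign)
    index = min(int(quantile * len(errors)), len(errors) - 1)
    return errors[index]


# Synthetic benign records: eight features that move together plus small noise.
rng = random.Random(0)
benign = []
for _ in range(500):
    base = rng.uniform(0.0, 1.0)
    benign.append([base + rng.gauss(0.0, 0.01) for _ in range(8)])

threshold = fit_threshold(benign, mean_bottleneck)


def is_anomalous(x: Vector) -> bool:
    return reconstruction_error(x, mean_bottleneck) > threshold


spike = [0.5] * 7 + [5.0]  # one feature far out of line with the rest
flat = [0.4] * 8           # consistent with the benign pattern
```

A record that breaks the structure learned from benign traffic reconstructs poorly and is flagged, while a record that fits the pattern is not; a real autoencoder would replace `mean_bottleneck` with the trained model's forward pass.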

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
