pith. machine review for the scientific record.

arxiv: 2605.09534 · v1 · submitted 2026-05-10 · 💻 cs.CR · cs.AI

Recognition: no theorem link

Governing AI-Assisted Security Operations: A Design Science Framework for Operational Decision Support

Elyson A. De La Cruz, Md Rasel Al Mamun, Rishikesh Sahay


Pith reviewed 2026-05-12 03:58 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords AI governance · security operations · design science research · operational decision support · query broker · risk mitigation · accountability framework

The pith

AI-assisted security operations must be governed as an engineering capability before they are scaled as automation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that introducing generative AI and retrieval-augmented tools into security operations centers creates privacy, cost, performance, and decision-quality risks that cannot be handled by technical safeguards alone. It develops a management framework through design science that treats AI-assisted decision support as a governed engineering process rather than direct automation. The framework is instantiated with a query-broker artifact that keeps AI planning separate from execution using approved templates, policy checks, read-only access, and auditable traces. A sympathetic reader would care because the approach gives concrete ways to assign roles, set maturity stages, and define quality gates while preserving accountability in high-risk settings. The study uses KQL queries on security telemetry as a bounded example of this broader governance problem.

Core claim

Using design science research, the study develops a governed AI query-broker artifact that separates AI planning from operational execution through schema-grounded retrieval, approved templates, policy validation, read-only adapters, normalized outputs, auditable agent traces, and engineering review board gates. The contribution is a management framework for governing AI-assisted operational decision support in high-risk digital infrastructure by specifying design propositions, role accountability, maturity stages, quality gates, evaluation criteria, and evidence boundaries.

What carries the argument

The governed AI query-broker artifact that separates AI planning from operational execution using schema-grounded retrieval, approved templates, policy validation, read-only adapters, normalized outputs, auditable agent traces, and engineering review board gates.
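The planning/execution separation this artifact enforces can be sketched in a few lines. This is an editorial illustration, not the authors' implementation: the class, template store, and policy rules below are hypothetical, and a real broker would sit in front of an actual read-only telemetry API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch of the plan/validate/execute separation the paper
# describes. Only queries instantiated from pre-approved templates reach
# execution, and every step is appended to an auditable trace.

APPROVED_TEMPLATES = {
    # template id -> KQL template with named placeholders (illustrative)
    "signin_failures": "SigninLogs | where TimeGenerated > ago({window}) "
                       "| where ResultType != 0 "
                       "| summarize count() by UserPrincipalName",
}

BLOCKED_FIELDS = {"PasswordHash", "SessionToken"}  # illustrative policy rule

@dataclass
class Broker:
    trace: list = field(default_factory=list)

    def plan(self, template_id: str, params: dict) -> str:
        """AI planning step: select an approved template and bind parameters."""
        if template_id not in APPROVED_TEMPLATES:
            raise PermissionError(f"template {template_id!r} is not approved")
        query = APPROVED_TEMPLATES[template_id].format(**params)
        self._log("plan", template_id=template_id, params=params)
        return query

    def validate(self, query: str) -> str:
        """Policy validation gate: reject queries touching blocked fields."""
        for f in BLOCKED_FIELDS:
            if f in query:
                self._log("reject", reason=f"blocked field {f}")
                raise PermissionError(f"query references blocked field {f}")
        self._log("validate", query=query)
        return query

    def execute(self, query: str) -> dict:
        """Execution step: would call a read-only adapter; stubbed here."""
        self._log("execute", query=query)
        return {"query": query, "rows": []}  # normalized output shape

    def _log(self, step: str, **details):
        self.trace.append({"ts": datetime.now(timezone.utc).isoformat(),
                           "step": step, **details})

broker = Broker()
q = broker.plan("signin_failures", {"window": "1h"})
result = broker.execute(broker.validate(q))
```

Note that the AI never emits free-form KQL here: its output is confined to a template choice plus parameters, which is what keeps the audit trace interpretable and the blast radius bounded.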

If this is right

  • Design propositions and role accountability mappings can be used to structure oversight of AI in other high-risk operational domains.
  • Maturity stages provide a sequence for organizations to move from initial pilots to scaled use while maintaining defined quality gates.
  • Evaluation criteria and evidence boundaries give measurable standards for auditing AI-assisted decisions and tracing accountability.
  • The separation of planning from execution limits the blast radius of any single AI-generated action.
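The maturity-stage idea in the second bullet can be made concrete as a gated ladder: an organization advances only when every quality gate of its current stage has documented evidence. A purely illustrative sketch — the stage names and gate criteria below are invented for the example, not taken from the paper:

```python
# Illustrative maturity ladder with quality gates. An organization may
# advance one stage only when all gates for its current stage are met.
# Stage names and gate labels are hypothetical, not the paper's own.

STAGES = ["pilot", "assisted", "governed", "scaled"]

GATES = {
    "pilot":    ["read_only_access_enforced", "audit_trace_enabled"],
    "assisted": ["approved_template_coverage", "policy_validation_in_path"],
    "governed": ["review_board_signoff", "cost_controls_measured"],
}

def next_stage(current: str, evidence: set[str]) -> str:
    """Return the next stage if every gate for `current` is satisfied,
    otherwise stay at `current`."""
    idx = STAGES.index(current)
    if idx == len(STAGES) - 1:
        return current  # already at the top of the ladder
    missing = [g for g in GATES[current] if g not in evidence]
    return STAGES[idx + 1] if not missing else current

assert next_stage("pilot", {"read_only_access_enforced"}) == "pilot"
assert next_stage("pilot", {"read_only_access_enforced",
                            "audit_trace_enabled"}) == "assisted"
```

The point of encoding the ladder this way is that "maturity" stops being a self-assessment and becomes a function of auditable evidence.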

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same broker pattern could be tested in non-security domains such as financial compliance queries where auditability is also required.
  • Organizations could run controlled experiments comparing decision quality and risk metrics with and without the review board gates.
  • The framework implies that regulatory requirements for AI in critical infrastructure might be met by embedding these evidence boundaries rather than by restricting AI use outright.
  • Similar artifacts might reduce the need for constant human review of every AI output once the gates are shown to hold in practice.

Load-bearing premise

The proposed design elements of the query-broker artifact will effectively mitigate privacy, cost, performance, and decision-quality risks when applied in real operational settings.

What would settle it

Deploy the query-broker in a live security operations center, generate AI-assisted queries over a fixed period, and check whether unauthorized sensitive data exposure, uncontrolled costs, or untraceable decisions still occur at rates above baseline.
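That test reduces to rate bookkeeping: count governance incidents per unit of AI-assisted query volume and compare each risk category against its pre-deployment baseline. A minimal sketch, with invented category names and baseline values:

```python
# Hypothetical evaluation bookkeeping for the proposed deployment test:
# incidents per 1,000 AI-assisted queries, compared category-by-category
# against a pre-deployment baseline. All numbers below are invented.

def incident_rate(incidents: int, queries: int) -> float:
    """Incidents per 1,000 queries."""
    return 1000 * incidents / queries

def exceeds_baseline(observed: dict, baseline: dict, queries: int) -> list[str]:
    """Return the risk categories whose observed rate is above baseline."""
    return [cat for cat, n in observed.items()
            if incident_rate(n, queries) > baseline[cat]]

baseline = {"sensitive_exposure": 2.0, "cost_overrun": 5.0, "untraceable": 1.0}
observed = {"sensitive_exposure": 1, "cost_overrun": 12, "untraceable": 0}

flagged = exceeds_baseline(observed, baseline, queries=1500)
# cost_overrun: 12 incidents over 1,500 queries is 8.0 per 1,000,
# above the 5.0 baseline, so only that category is flagged
```

A real evaluation would also need a significance test over the rate difference, but even this bookkeeping forces the evidence boundaries the paper asks for: a per-category baseline and a countable incident definition.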

Figures

Figures reproduced from arXiv: 2605.09534 by Elyson A. De La Cruz, Md Rasel Al Mamun, Rishikesh Sahay.

Figure 1. Governed AI query-broker architecture for enterprise threat hunting.
read the original abstract

Engineering managers increasingly must decide how to introduce generative artificial intelligence (AI), retrieval-augmented generation, and coding agents into high-risk operational functions without weakening accountability, privacy, cost discipline, or auditability. The central message of this study is that AI-assisted operational decision support should be managed as a governed engineering capability before it is scaled as automation. Security operations centers (SOCs) provide a suitable setting because they combine privileged telemetry, specialist expertise, software repositories, cloud services, and evidence-sensitive decisions. This study uses Kusto Query Language (KQL) and Microsoft Azure security capabilities as a bounded technical instantiation of that broader engineering management problem. KQL is read-only in ordinary query use, but read-only does not mean risk-free: AI-assisted queries can still create privacy, cost, performance, schema-validity, and decision-quality risks through broad scans, sensitive-field exposure, stale intelligence, and misleading interpretations. Using design science research, the study develops a governed AI query-broker artifact that separates AI planning from operational execution through schema-grounded retrieval, approved templates, policy validation, read-only adapters, normalized outputs, auditable agent traces, and engineering review board gates. The contribution is not a new KQL technique, security product, or detection algorithm. Rather, the study contributes a management framework for governing AI-assisted operational decision support in high-risk digital infrastructure by specifying design propositions, role accountability, maturity stages, quality gates, evaluation criteria, and evidence boundaries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims to develop, using design science research, a governed AI query-broker artifact for AI-assisted operational decision support in security operations centers (SOCs) using KQL and Azure. This artifact separates AI planning from execution via schema-grounded retrieval, approved templates, policy validation, read-only adapters, auditable traces, and engineering review board gates. The main contribution is a management framework specifying design propositions, role accountability, maturity stages, quality gates, evaluation criteria, and evidence boundaries to govern AI use and mitigate associated risks in high-risk digital infrastructure.

Significance. Should the proposed mechanisms prove effective in practice, the framework would represent a significant step toward responsible integration of generative AI into security operations by prioritizing governance, accountability, and risk mitigation. It addresses a timely issue in engineering management of AI in critical infrastructure. The design science approach allows for a structured proposal of the artifact, though its impact depends on future validation.

major comments (2)
  1. Abstract: The central claim that the query-broker artifact mitigates privacy, cost, performance, and decision-quality risks through schema-grounded retrieval, approved templates, policy validation, read-only adapters, auditable traces, and engineering review board gates is unsupported by evaluation results, case studies, empirical evidence, threat models, failure-mode analysis, prototype implementation, or pilot metrics. The mechanisms remain at a conceptual level without demonstrated risk reduction in operational settings.
  2. Design Science Framework (full text): The design propositions, maturity stages, quality gates, and evaluation criteria are constructed internally by the authors without grounding in external benchmarks, prior validated models, or independent data, creating circularity where the artifact defines its own success criteria rather than being tested against real-world outcomes.
minor comments (1)
  1. Abstract: The explicit statement that the contribution is a management framework (not a new KQL technique or detection algorithm) is helpful, but the manuscript would benefit from a dedicated comparison subsection to related governance frameworks in AI ethics or security operations management.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the timeliness of governing AI in security operations. We address each major comment below, clarifying the scope of our design science contribution while committing to revisions that strengthen the manuscript without overstating its current empirical basis.

read point-by-point responses
  1. Referee: Abstract: The central claim that the query-broker artifact mitigates privacy, cost, performance, and decision-quality risks through schema-grounded retrieval, approved templates, policy validation, read-only adapters, auditable traces, and engineering review board gates is unsupported by evaluation results, case studies, empirical evidence, threat models, failure-mode analysis, prototype implementation, or pilot metrics. The mechanisms remain at a conceptual level without demonstrated risk reduction in operational settings.

    Authors: We agree that the manuscript presents a design science proposal rather than an empirically validated implementation, and the abstract's phrasing of risk mitigation should be qualified to reflect this. The contribution lies in specifying the governed artifact and management framework through design propositions; empirical demonstration of risk reduction is outside the current scope but is explicitly bounded in the paper. In revision we will (1) temper the abstract language to emphasize 'proposed mechanisms for risk mitigation' and (2) add a dedicated subsection on 'Evaluation Boundaries and Planned Validation' that outlines threat models, failure-mode analysis, and pilot metrics as future work. This addresses the concern directly while preserving the design science character of the work. revision: partial

  2. Referee: Design Science Framework (full text): The design propositions, maturity stages, quality gates, and evaluation criteria are constructed internally by the authors without grounding in external benchmarks, prior validated models, or independent data, creating circularity where the artifact defines its own success criteria rather than being tested against real-world outcomes.

    Authors: The design propositions and criteria are explicitly derived from established design science methodology (Hevner et al.) and draw upon prior validated models in AI governance, cybersecurity risk frameworks (NIST AI RMF), and SOC operational literature. Nevertheless, we accept that additional explicit linkages would reduce any appearance of circularity. In revision we will insert a new table mapping each design proposition and quality gate to specific external benchmarks and literature sources, and we will expand the 'Related Work' section to highlight these connections. This will demonstrate grounding beyond internal construction while retaining the novel synthesis of the query-broker artifact. revision: yes

Circularity Check

0 steps flagged

No circularity detected in design-science framework proposal

full rationale

The paper applies design science to construct and describe a management framework (query-broker artifact, design propositions, maturity stages, quality gates, evaluation criteria, and evidence boundaries) for governing AI-assisted SOC operations. No equations, fitted parameters, predictions, or self-citations appear in the text that reduce any claim to its own inputs by construction. The listed mechanisms and criteria are presented as the direct output of the design process itself rather than derived from or validated against external benchmarks in a manner that creates circularity. This is a standard, self-contained conceptual contribution and does not match any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on domain assumptions about AI risks in SOCs and the suitability of design science, plus newly introduced entities like the query-broker without independent evidence or testing.

axioms (2)
  • domain assumption AI-assisted queries create privacy, cost, performance, schema-validity, and decision-quality risks even when read-only.
    Explicitly stated in the abstract as motivation for the governance approach.
  • domain assumption Design science research is an appropriate method for developing a governed AI query-broker artifact.
    The study applies this method to create and position the framework.
invented entities (2)
  • Governed AI query-broker artifact no independent evidence
    purpose: Separates AI planning from operational execution to manage risks
    Newly proposed as the core technical instantiation of the framework.
  • Engineering review board gates no independent evidence
    purpose: Provide oversight and approval for AI-assisted operations
    Introduced as part of the quality gates and accountability structure.

pith-pipeline@v0.9.0 · 5576 in / 1685 out tokens · 56352 ms · 2026-05-12T03:58:26.311489+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages

  1. [1]

    security: runHuntingQuery,

    Microsoft, “security: runHuntingQuery,” Microsoft Graph REST API documentation, 2025. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/graph/api/security-security-runhuntingquery?view=graph-rest-1.0

  2. [2]

    Kusto Query Language overview,

    Microsoft, “Kusto Query Language overview,” Microsoft Learn, 2025. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/kusto/query/?view=microsoft-fabric

  3. [3]

    Overview: Advanced hunting in Microsoft Defender XDR,

    Microsoft, “Overview: Advanced hunting in Microsoft Defender XDR,” Microsoft Learn. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/defender-xdr/advanced-hunting-overview

  5. [5]

    Azure Monitor Log Analytics API overview,

    Microsoft, “Azure Monitor Log Analytics API overview,” Microsoft Learn, 2026. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/api/overview

  6. [6]

    Run KQL queries on the Microsoft Sentinel data lake using APIs,

    Microsoft, “Run KQL queries on the Microsoft Sentinel data lake using APIs,” Microsoft Learn. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/sentinel/datalake/kql-queries-api

  8. [8]

    Kusto API overview,

    Microsoft, “Kusto API overview,” Microsoft Learn, 2026. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/kusto/api/?view=microsoft-fabric

  9. [9]

    Design science in information systems research,

    A. R. Hevner, S. T. March, J. Park, and S. Ram, “Design science in information systems research,” MIS Quarterly, vol. 28, no. 1, pp. 75–105, 2004, doi: 10.2307/25148625

  10. [10]

    A design science research methodology for information systems research,

    K. Peffers, T. Tuunanen, M. A. Rothenberger, and S. Chatterjee, “A design science research methodology for information systems research,” Journal of Management Information Systems, vol. 24, no. 3, pp. 45–77, 2007, doi: 10.2753/MIS0742-1222240302

  11. [11]

    Artificial intelligence and management: The automation-augmentation paradox,

    S. Raisch and S. Krakowski, “Artificial intelligence and management: The automation-augmentation paradox,” Academy of Management Review, vol. 46, no. 1, pp. 192–210, 2021, doi: 10.5465/amr.2018.0072

  12. [12]

    Managing artificial intelligence,

    N. Berente, B. Gu, J. Recker, and R. Santhanam, “Managing artificial intelligence,” MIS Quarterly, vol. 45, no. 3, pp. 1433–1450, 2021, doi: 10.25300/MISQ/2021/16274

  13. [13]

    Understanding digital transformation: A review and a research agenda,

    G. Vial, “Understanding digital transformation: A review and a research agenda,” Journal of Strategic Information Systems, vol. 28, no. 2, pp. 118–144, 2019, doi: 10.1016/j.jsis.2019.01.003

  14. [14]

    Digital business strategy: Toward a next generation of insights,

    A. Bharadwaj, O. A. El Sawy, P. A. Pavlou, and N. Venkatraman, “Digital business strategy: Toward a next generation of insights,” MIS Quarterly, vol. 37, no. 2, pp. 471–482, 2013, doi: 10.25300/MISQ/2013/37:2.3

  15. [15]

    Explicating dynamic capabilities: The nature and microfoundations of sustainable enterprise performance,

    D. J. Teece, “Explicating dynamic capabilities: The nature and microfoundations of sustainable enterprise performance,” Strategic Management Journal, vol. 28, no. 13, pp. 1319–1350, 2007, doi: 10.1002/smj.640

  16. [16]

    Artificial intelligence and the future of work: Human-AI symbiosis in organizational decision making,

    M. H. Jarrahi, “Artificial intelligence and the future of work: Human-AI symbiosis in organizational decision making,” Business Horizons, vol. 61, no. 4, pp. 577–586, 2018, doi: 10.1016/j.bushor.2018.03.007

  17. [17]

    Working and organizing in the age of the learning algorithm,

    S. Faraj, S. Pachidi, and K. Sayegh, “Working and organizing in the age of the learning algorithm,” Information and Organization, vol. 28, no. 1, pp. 62–70, 2018, doi: 10.1016/j.infoandorg.2018.02.005

  18. [18]

    Digital innovation management: Reinventing innovation management research in a digital world,

    S. Nambisan, K. Lyytinen, A. Majchrzak, and M. Song, “Digital innovation management: Reinventing innovation management research in a digital world,” MIS Quarterly, vol. 41, no. 1, pp. 223–238, 2017, doi: 10.25300/MISQ/2017/41:1.03

  19. [19]

    From use to effective use: A representation theory perspective,

    A. Burton-Jones and C. Grange, “From use to effective use: A representation theory perspective,” Information Systems Research, vol. 24, no. 3, pp. 632–658, 2013, doi: 10.1287/isre.1120.0444

  20. [20]

    The third hand: IT-enabled competitive advantage in turbulence through improvisational capabilities,

    P. A. Pavlou and O. A. El Sawy, “The third hand: IT-enabled competitive advantage in turbulence through improvisational capabilities,” Information Systems Research, vol. 21, no. 3, pp. 443–471, 2010, doi: 10.1287/isre.1100.0280

  21. [21]

    Artificial Intelligence Risk Management Framework (AI RMF 1.0),

    National Institute of Standards and Technology, “Artificial Intelligence Risk Management Framework (AI RMF 1.0),” NIST AI 100-1, 2023, doi: 10.6028/NIST.AI.100-1

  22. [22]

    Guidelines for human-AI interaction,

    S. Amershi et al., “Guidelines for human-AI interaction,” in Proc. CHI Conf. Human Factors in Computing Systems, 2019, pp. 1–13, doi: 10.1145/3290605.3300233

  23. [23]

    Designing transparency for effective human-AI collaboration,

    M. Vössing, N. Kühl, M. Lind, and G. Satzger, “Designing transparency for effective human-AI collaboration,” Information Systems Frontiers, vol. 24, pp. 877–895, 2022, doi: 10.1007/s10796-022-10284-3

  24. [24]

    Enhancing security operations center: Wazuh security event response with retrieval-augmented-generation-driven copilot,

    I. Ismail, R. Kurnia, and F. Widyatama, “Enhancing security operations center: Wazuh security event response with retrieval-augmented-generation-driven copilot,” Sensors, vol. 25, no. 3, p. 870, 2025, doi: 10.3390/s25030870

  25. [25]

    Generative AI for cyber security: Analyzing the potential of ChatGPT, DALL-E, and other models for enhancing the security space,

    S. Sai, U. Yashvardhan, V. Chamola, and B. K. Sikdar, “Generative AI for cyber security: Analyzing the potential of ChatGPT, DALL-E, and other models for enhancing the security space,” IEEE Access, vol. 12, pp. 53497–53516, 2024, doi: 10.1109/ACCESS.2024.3385107

  26. [26]

    Matched and mismatched SOCs: A qualitative study on security operations center issues,

    F. B. Kokulu, A. Soneji, T. Bao, Y. Shoshitaishvili, Z. Zhao, A. Doupé, and G.-J. Ahn, “Matched and mismatched SOCs: A qualitative study on security operations center issues,” in Proc. ACM SIGSAC Conf. Computer and Communications Security, 2019, pp. 1955–1970, doi: 10.1145/3319535.3354239

  27. [27]

    Design science research contributions: Finding a balance between artifact and theory,

    R. Baskerville, A. Baiyere, S. Gregor, A. R. Hevner, and M. Rossi, “Design science research contributions: Finding a balance between artifact and theory,” Journal of the Association for Information Systems, vol. 19, no. 5, pp. 358–376, 2018, doi: 10.17705/1jais.00495

  28. [28]

    FEDS: A framework for evaluation in design science research,

    J. Venable, J. Pries-Heje, and R. Baskerville, “FEDS: A framework for evaluation in design science research,” European Journal of Information Systems, vol. 25, no. 1, pp. 77–89, 2016, doi: 10.1057/ejis.2014.36

  29. [29]

    Transparency in design science research,

    A. R. Hevner, J. Parsons, A. B. Brendel, R. Lukyanenko, V. Tiefenbeck, M. C. Tremblay, and J. vom Brocke, “Transparency in design science research,” Decision Support Systems, vol. 182, Art. no. 114236, 2024, doi: 10.1016/j.dss.2024.114236

  30. [30]

    Reliability in design science research,

    V. C. Storey, R. L. Baskerville, and M. Kaul, “Reliability in design science research,” Information Systems Journal, vol. 35, no. 3, pp. 984–1014, 2025, doi: 10.1111/isj.12564

  31. [31]

    Validity in design science,

    K. R. Larsen, R. Lukyanenko, R. M. Mueller, V. C. Storey, J. Parsons, D. VanderMeer, and D. S. Hovorka, “Validity in design science,” MIS Quarterly, vol. 49, no. 4, pp. 1267–1294, 2025, doi: 10.25300/MISQ/2024/18064

  32. [32]

    Best practices for Kusto Query Language queries,

    Microsoft, “Best practices for Kusto Query Language queries,” Microsoft Learn, 2026. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/kusto/query/best-practices?view=microsoft-fabric

  33. [33]

    Threat intelligence in Microsoft Sentinel,

    Microsoft, “Threat intelligence in Microsoft Sentinel,” Microsoft Learn, 2025. Accessed: May 1, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/sentinel/understand-threat-intelligence

  34. [34]

    Secure Software Development Framework (SSDF) Version 1.1,

    National Institute of Standards and Technology, “Secure Software Development Framework (SSDF) Version 1.1,” NIST SP 800-218, 2022, doi: 10.6028/NIST.SP.800-218

  35. [35]

    Software Assurance Maturity Model,

    OWASP Foundation, “Software Assurance Maturity Model,” OWASP SAMM Project, 2026. Accessed: May 1, 2026. [Online]. Available: https://owaspsamm.org/

  36. [36]

    Cumulated gain-based evaluation of IR techniques,

    K. Järvelin and J. Kekäläinen, “Cumulated gain-based evaluation of IR techniques,” ACM Transactions on Information Systems, vol. 20, no. 4, pp. 422–446, 2002, doi: 10.1145/582415.582418

  37. [37]

    An introduction to ROC analysis,

    T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006, doi: 10.1016/j.patrec.2005.10.010

  38. [38]

    The relationship between precision-recall and ROC curves,

    J. Davis and M. Goadrich, “The relationship between precision-recall and ROC curves,” in Proc. 23rd Int. Conf. Machine Learning, 2006, pp. 233–240, doi: 10.1145/1143844.1143874

  39. [39]

    On calibration of modern neural networks,

    C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” in Proc. 34th Int. Conf. Machine Learning, PMLR, vol. 70, 2017, pp. 1321–1330

  40. [40]

    OWASP Top 10 for Large Language Model Applications,

    OWASP Foundation, “OWASP Top 10 for Large Language Model Applications,” OWASP GenAI Security Project, 2026. Accessed: May 1, 2026. [Online]. Available: https://genai.owasp.org/llm-top-10/

  41. [41]

    Incident Response Recommendations and Considerations for Cybersecurity Risk Management,

    National Institute of Standards and Technology, “Incident Response Recommendations and Considerations for Cybersecurity Risk Management,” NIST SP 800-61 Rev. 3, 2025, doi: 10.6028/NIST.SP.800-61r3

  42. [42]

    Explanation in artificial intelligence: Insights from the social sciences,

    T. Miller, “Explanation in artificial intelligence: Insights from the social sciences,” Artificial Intelligence, vol. 267, pp. 1–38, 2019, doi: 10.1016/j.artint.2018.07.007

  43. [43]

    The next generation of research on IS use: A theoretical framework of delegation to and from agentic IS artifacts,

    A. Baird and L. M. Maruping, “The next generation of research on IS use: A theoretical framework of delegation to and from agentic IS artifacts,” MIS Quarterly, vol. 45, no. 1, pp. 315–341, 2021, doi: 10.25300/MISQ/2021/15882

  44. [44]

    Algorithms at work: The new contested terrain of control,

    K. C. Kellogg, M. A. Valentine, and A. Christin, “Algorithms at work: The new contested terrain of control,” Academy of Management Annals, vol. 14, no. 1, pp. 366–410, 2020, doi: 10.5465/annals.2018.0174

  45. [45]

    Positioning and presenting design science research for maximum impact,

    S. Gregor and A. R. Hevner, “Positioning and presenting design science research for maximum impact,” MIS Quarterly, vol. 37, no. 2, pp. 337–355, 2013, doi: 10.25300/MISQ/2013/37.2.01