An Approach for a Supporting Multi-LLM System for Automated Certification Based on the German IT-Grundschutz
Pith reviewed 2026-06-25 20:49 UTC · model grok-4.3
The pith
A multi-LLM system with HybridRAG supports semi-automated BSI IT-Grundschutz certification to meet NIS2-driven demand.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that a Multi-LLM System (MLS) architecture combining LLMs and knowledge graphs (KGs) via HybridRAG can support the full sequence of IT-Grundschutz certification steps, thereby raising efficiency, lowering costs, and helping certifiers sustain quality under the increased volume created by NIS2.
What carries the argument
The Multi-LLM System (MLS) with HybridRAG, which routes domain tasks between LLMs and knowledge graphs to cover certification phases.
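As a rough illustration of what routing certification phases to specialised components could look like, the sketch below dispatches each phase named in the abstract to a handler. The paper gives no implementation, so every name here (`Phase`, `ROUTES`, `route`, the stub handlers) and the choice of which phase consults which component are hypothetical assumptions. The handlers are stubs, not real LLM or knowledge-graph calls.

```python
from enum import Enum
from typing import Callable, Dict


class Phase(Enum):
    # Phase names follow the abstract; the enum itself is an assumption.
    PROTECTION_NEEDS = "protection needs assessment"
    MODELING = "modeling"
    GRUNDSCHUTZ_CHECK = "IT-Grundschutz check"
    CONSOLIDATION = "measure consolidation"
    REALIZATION = "realization"


def llm_stub(task: str) -> str:
    # Placeholder for a call to an LLM agent (hypothetical).
    return f"[LLM draft] {task}"


def kg_stub(task: str) -> str:
    # Placeholder for a knowledge-graph lookup (hypothetical).
    return f"[KG facts] {task}"


def hybrid_stub(task: str) -> str:
    # Placeholder for a step that combines both sources (hypothetical).
    return f"{kg_stub(task)} + {llm_stub(task)}"


# Hypothetical routing table: which component handles which phase.
ROUTES: Dict[Phase, Callable[[str], str]] = {
    Phase.PROTECTION_NEEDS: llm_stub,
    Phase.MODELING: kg_stub,
    Phase.GRUNDSCHUTZ_CHECK: hybrid_stub,
    Phase.CONSOLIDATION: hybrid_stub,
    Phase.REALIZATION: llm_stub,
}


def route(phase: Phase, task: str) -> dict:
    """Dispatch a task to its handler and flag the result for human review."""
    output = ROUTES[phase](task)
    # Every result stays a draft until a certifier signs off.
    return {"phase": phase.value, "output": output, "needs_review": True}
```

The `needs_review` flag reflects the semi-automated framing: in this sketch no phase output is treated as final without a certifier.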
If this is right
- The architecture can process the larger number of companies now subject to certification under NIS2.
- Implementation and certification costs drop while certifiers retain oversight of quality.
- All listed process phases from assessment through realization receive automated assistance.
- Specialist time is redirected from routine tasks to higher-level review.
Where Pith is reading between the lines
- The same MLS pattern could be retargeted at other national or sector-specific security certification schemes.
- Smaller organizations might reach compliance thresholds faster once the system is production-ready.
- Workflow integration studies would be needed to measure actual time savings versus current manual practice.
- Error patterns observed in early deployments could guide targeted knowledge-graph expansions.
Load-bearing premise
LLMs plus knowledge graphs can reliably manage the intricate, domain-specific steps of IT-Grundschutz certification without frequent major errors that would still require heavy human correction.
What would settle it
A controlled run on a standard IT-Grundschutz case in which the system outputs protection needs or measures that certified experts judge to be materially incorrect or incomplete.
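One way such a controlled run could be scored is sketched below: the measures proposed by the system are compared with a set approved by certified experts, giving precision (how much of the output is correct) and recall (how much of the expert set was covered). This is an illustrative assumption, not an evaluation protocol from the paper. The function name `compare_to_expert` and the `REQ-n` identifiers are hypothetical placeholders.

```python
from typing import Dict, Set


def compare_to_expert(system: Set[str], expert: Set[str]) -> Dict[str, object]:
    """Score system-proposed measures against an expert reference set."""
    correct = system & expert
    # Guard against empty sets to avoid division by zero.
    precision = len(correct) / len(system) if system else 0.0
    recall = len(correct) / len(expert) if expert else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "missing": sorted(expert - system),   # incomplete output
        "spurious": sorted(system - expert),  # incorrect output
    }


# Hypothetical example data, not taken from any real certification case.
report = compare_to_expert(
    system={"REQ-1", "REQ-2", "REQ-4"},
    expert={"REQ-1", "REQ-2", "REQ-3"},
)
```

Set overlap only captures exact matches on identifiers. Judging whether a free-text justification is materially correct would still need expert review.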
Original abstract
This paper presents a novel approach to perform semi-automated BSI IT-Grundschutz certification using a Multi-Large Language Model system (MLS) with Hybrid Retrieval-Augmented Generation (HybridRAG). Facing the challenges of the Network and Information Security Directive 2 (NIS2) directive, a shortage of specialists, and high implementation costs, our MLS architecture aims to increase efficiency, reduce costs, and support certifiers in maintaining the quality of security concepts while meeting the increased demand for certifications of newly affected companies. The system combines Large Language Models (LLMs) and Knowledge Graphs (KGs) to support different phases of the certification process, including protection needs assessment, modeling, IT-Grundschutz check, measure consolidation, and subsequent realization. Our architecture addresses the growing demand for security concepts and offers an approach to handle the digital security challenges introduced by NIS2.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Multi-LLM System (MLS) with HybridRAG for semi-automated BSI IT-Grundschutz certification. It describes an architecture combining LLMs and knowledge graphs to support phases including protection-needs assessment, modeling, IT-Grundschutz checks, measure consolidation, and realization, with the goal of increasing efficiency and reducing costs to meet NIS2-driven demand.
Significance. If the architecture could be shown to perform the described tasks reliably with limited human oversight, the approach would address a practical bottleneck in scaling IT-security certifications amid specialist shortages.
major comments (1)
- [Abstract and system-architecture description] The central claim that the MLS + HybridRAG combination can reliably perform protection-needs assessment, modeling, IT-Grundschutz checks, and measure consolidation with limited human intervention is load-bearing yet unsupported. The manuscript provides only a high-level architecture description and offers no prototype implementation, test cases, accuracy/completeness metrics, error analysis, or comparison against expert-certified outputs (see Abstract and the sections describing component roles).
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The major comment correctly identifies that the manuscript is a high-level architectural proposal without empirical evaluation. We will revise the paper to clarify its scope as a conceptual contribution and to moderate claims accordingly.
Point-by-point responses
- Referee: [Abstract and system-architecture description] The central claim that the MLS + HybridRAG combination can reliably perform protection-needs assessment, modeling, IT-Grundschutz checks, and measure consolidation with limited human intervention is load-bearing yet unsupported. The manuscript provides only a high-level architecture description and offers no prototype implementation, test cases, accuracy/completeness metrics, error analysis, or comparison against expert-certified outputs (see Abstract and the sections describing component roles).
  Authors: We agree that the current manuscript presents only a conceptual architecture and does not include any prototype, test cases, metrics, or expert comparisons. The abstract and body describe an approach that 'aims to increase efficiency' and 'support certifiers,' rather than asserting proven reliability. To address the concern, we will revise the abstract, introduction, and architecture sections to explicitly frame the work as a proposed framework whose benefits remain to be validated through implementation. We will add a new subsection on limitations and future work that outlines the planned prototype development, evaluation against certified outputs, and collection of accuracy metrics. These changes will ensure the claims match the evidential content of the paper.
  Revision: yes
Circularity Check
No circularity: high-level architecture proposal with no derivations or fitted parameters
full rationale
The manuscript is a conceptual system proposal describing a Multi-LLM architecture with HybridRAG for IT-Grundschutz certification tasks. It contains no equations, no parameter fitting, no predictions derived from data, and no self-citations that serve as load-bearing justifications for uniqueness or ansatzes. All claims are forward-looking architectural suggestions rather than reductions of outputs to inputs by construction. The derivation chain is therefore self-contained at the level of design description.