
arxiv: 2606.28364 · v1 · pith:W7LI75KTnew · submitted 2026-06-14 · 💻 cs.IR

LLM based Knowledge Graph Approach to Automating Medical Device Regulatory Compliance

Pith reviewed 2026-06-30 11:08 UTC · model grok-4.3

classification 💻 cs.IR
keywords knowledge graph · large language model · regulatory compliance · medical devices · SPARQL queries · FDA regulations · automated classification · OWL RDF

The pith

Regulatory knowledge from FDA documents is encoded in a knowledge graph that an LLM queries to classify devices and check compliance automatically.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a framework that extracts FDA medical device regulations into an OWL/RDF knowledge graph. Mistral 7B Instruct then generates SPARQL queries to reason over the graph, classify devices by risk class, and produce compliance reports. This targets the problem of extensive, cross-referenced documents that currently demand heavy manual effort from manufacturers. If successful, it would make compliance checks faster, more consistent, and interpretable while lowering costs for bringing devices to market.

Core claim

The authors claim that translating FDA regulations into a machine-processable OWL/RDF knowledge graph and using an LLM to dynamically generate SPARQL queries allows automated device classification into Class I, II, or III along with real-time regulatory evaluation, as shown in validated use cases that reduce manual review.
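
To make the query-generation step concrete, here is a minimal sketch of how a few-shot text-to-SPARQL prompt might be assembled. It is an illustration only: the paper's actual prompt is not shown in the available text, the `ex:` properties and the example questions are hypothetical, and only the class names Device, Subpart and Classification come from the Figure 2 caption. Wrapping the string in the model's instruction format and calling the model are left out.

```python
# Hypothetical few-shot pairs. The ex: properties are illustrative and are
# NOT taken from the paper's ontology.
FEW_SHOT = [
    (
        "Which class is the device labelled 'ExampleDeviceA'?",
        'SELECT ?cls WHERE { ?d rdfs:label "ExampleDeviceA" ; '
        "ex:regulatedUnder ?sp . ?sp ex:hasClassification ?cls . }",
    ),
    (
        "List all devices regulated under subpart ex:SubpartX.",
        "SELECT ?d WHERE { ?d a ex:Device ; ex:regulatedUnder ex:SubpartX . }",
    ),
]


def build_prompt(question: str, schema_hint: str) -> str:
    """Assemble a plain-text few-shot prompt that asks for one SPARQL query."""
    parts = [
        "Translate the user request into a single SPARQL query.",
        "Use only these classes and properties:",
        schema_hint.strip(),
        "",
    ]
    # Each worked example shows the model the expected input/output shape.
    for example_question, example_query in FEW_SHOT:
        parts += [f"Question: {example_question}", f"SPARQL: {example_query}", ""]
    # The prompt ends on an open "SPARQL:" slot for the model to complete.
    parts += [f"Question: {question.strip()}", "SPARQL:"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Is ExampleDeviceB a Class III device?",
        "Classes: ex:Device, ex:Subpart, ex:Classification",
    ))
```

Restricting the prompt to a stated schema is one plausible way to keep a small model from inventing predicates; whether the authors do this is not stated.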

What carries the argument

An OWL/RDF knowledge graph that stores the regulatory knowledge, queried with SPARQL that the Mistral 7B Instruct model generates on demand to perform compliance reasoning.
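
The idea of classifying a device by walking the graph can be sketched with a toy in-memory triple store. This is a stand-in for a real RDF store and SPARQL engine, not the paper's implementation; every identifier (`ex:DeviceA`, `ex:regulatedUnder`, `ex:hasClassification`, and so on) is a hypothetical placeholder and none of the triples are real FDA data.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Toy graph. All identifiers are hypothetical placeholders, not FDA data.
GRAPH: set[Triple] = {
    ("ex:DeviceA", "rdf:type", "ex:Device"),
    ("ex:DeviceA", "ex:regulatedUnder", "ex:SubpartX"),
    ("ex:SubpartX", "rdf:type", "ex:Subpart"),
    ("ex:SubpartX", "ex:hasClassification", "ex:ClassII"),
    ("ex:DeviceB", "rdf:type", "ex:Device"),
    ("ex:DeviceB", "ex:regulatedUnder", "ex:SubpartY"),
    ("ex:SubpartY", "ex:hasClassification", "ex:ClassI"),
}


def match(s: Optional[str], p: Optional[str], o: Optional[str]) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None plays the role of a variable."""
    for triple in GRAPH:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def classify(device: str) -> list[str]:
    """Two-hop join, roughly the SPARQL pattern
    { <device> ex:regulatedUnder ?sp . ?sp ex:hasClassification ?cls }."""
    classes = []
    for _, _, subpart in match(device, "ex:regulatedUnder", None):
        for _, _, cls in match(subpart, "ex:hasClassification", None):
            classes.append(cls)
    return sorted(classes)


if __name__ == "__main__":
    print(classify("ex:DeviceA"))
```

A device that is missing from the graph simply yields an empty result, which is one reason the completeness of the graph is load-bearing for the approach.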

Load-bearing premise

The knowledge graph must accurately represent all cross-referenced FDA regulations without omissions or errors, and the LLM must consistently produce accurate SPARQL queries even for complex compliance questions.

What would settle it

Run the system on a medical device whose classification and compliance status are already established, then check whether the output matches the official FDA determination or an expert analysis.

Figures

Figures reproduced from arXiv: 2606.28364 by Karuna Pande Joshi, Subhankar Chattoraj.

Figure 1: Overview for automating medical device regulations of the proposed semantically rich KG and LLM based architecture.
Figure 2: LLM-KG medical device Knowledge Graph top-level classes include Device, Subpart and Classification that are related via object properties such …
Original abstract

Advanced medical devices increasingly rely on AI-driven frameworks to automate compliance processes, ensuring safety and efficacy while reducing regulatory burdens. In the United States, software-based medical devices, including those utilizing AI/ML models, are regulated by the FDA's Center for Devices and Radiological Health (CDRH) under the Code of Federal Regulations (CFR) Title 21. These regulations are extensive, cross-referenced documents that require significant human effort to parse, leading to high compliance costs for manufacturers. We propose a novel, semantically rich framework that extracts regulatory knowledge from FDA documents and translates it into a machine-processable format. Our system encodes regulatory knowledge into an OWL/RDF-based knowledge graph and uses the Mistral 7B Instruct model to dynamically generate SPARQL queries, perform compliance reasoning, and produce structured reports. This enables automated device classification (Class I, II, or III) and real-time regulatory evaluation. Validated through real-world use cases, our framework significantly reduces manual review effort, enhances interpretability, and accelerates time-to-market. The proposed approach integrates AI reasoning and semantic technologies to achieve scalable, transparent, and automated regulatory compliance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes an LLM-based framework that extracts knowledge from FDA CFR Title 21 documents into an OWL/RDF knowledge graph, employs the Mistral 7B Instruct model to dynamically generate SPARQL queries for compliance reasoning and device classification (Class I/II/III), and produces structured reports. It claims this enables automated, real-time regulatory evaluation and has been validated on real-world use cases with significant reduction in manual effort.

Significance. If the extraction and query-generation steps prove reliable on cross-referenced regulations, the approach could reduce compliance costs for medical-device manufacturers and improve interpretability via semantic technologies; however, the absence of any reported metrics or error analysis leaves the practical impact unquantified.

major comments (2)
  1. [Abstract] The claim that the system was 'validated through real-world use cases' and 'significantly reduces manual review effort' is unsupported; the manuscript supplies no description of the use cases, extraction pipeline, inter-annotator agreement for KG construction, SPARQL query success rate, or any quantitative comparison against manual review.
  2. [Approach description (implied in abstract)] The central assumption that regulatory text is losslessly translated into the OWL/RDF graph and that Mistral 7B Instruct produces semantically correct SPARQL for dense cross-references is stated without evidence; no counter-example handling, query validation procedure, or fidelity metrics are provided.
minor comments (1)
  1. The manuscript should include a dedicated Methods or Evaluation section with explicit metrics (e.g., precision/recall on classification, query executability rate) before the validation claim can be assessed.
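
The metrics named in the minor comment are cheap to compute once gold labels and execution logs exist. The sketch below shows per-class precision and recall for a Class I/II/III classifier and a query executability rate. The function names and the sample labels in the usage block are made up for illustration; the paper reports no such figures.

```python
from collections import Counter


def per_class_precision_recall(gold: list[str], pred: list[str]) -> dict[str, tuple[float, float]]:
    """Return {class: (precision, recall)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = {}
    for cls in sorted(set(gold) | set(pred)):
        predicted = tp[cls] + fp[cls]
        actual = tp[cls] + fn[cls]
        precision = tp[cls] / predicted if predicted else 0.0
        recall = tp[cls] / actual if actual else 0.0
        scores[cls] = (precision, recall)
    return scores


def executability_rate(ran_ok: list[bool]) -> float:
    """Fraction of generated queries that executed without an error."""
    return sum(ran_ok) / len(ran_ok) if ran_ok else 0.0


if __name__ == "__main__":
    # Illustrative labels only, not results from the paper.
    gold = ["I", "II", "II", "III"]
    pred = ["I", "II", "III", "III"]
    print(per_class_precision_recall(gold, pred))
    print(executability_rate([True, True, False, True]))
```

Executability is only a lower bar: a query can run cleanly and still return the wrong regulation, so it complements rather than replaces the classification scores.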

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments identifying areas where additional detail and evidence would strengthen the manuscript. We address each point below and will revise accordingly.

Point-by-point responses
  1. Referee: [Abstract] The claim that the system was 'validated through real-world use cases' and 'significantly reduces manual review effort' is unsupported; the manuscript supplies no description of the use cases, extraction pipeline, inter-annotator agreement for KG construction, SPARQL query success rate, or any quantitative comparison against manual review.

    Authors: We agree the abstract claims require supporting detail. The manuscript will be revised to include an expanded description of the real-world use cases (currently summarized in Section 4), the extraction pipeline for KG construction from CFR Title 21, and any internal metrics collected on SPARQL query success. Inter-annotator agreement was not performed because KG population combined automated LLM extraction with targeted human review rather than multiple independent annotators; this will be clarified. Quantitative effort-reduction comparisons will be added based on the logged manual review times from the use cases.

    Revision: yes

  2. Referee: [Approach description (implied in abstract)] The central assumption that regulatory text is losslessly translated into the OWL/RDF graph and that Mistral 7B Instruct produces semantically correct SPARQL for dense cross-references is stated without evidence; no counter-example handling, query validation procedure, or fidelity metrics are provided.

    Authors: The current text presents the framework but does not supply explicit fidelity metrics or counter-example analysis. We will add a dedicated subsection on query generation that includes (1) the prompt template and few-shot examples used with Mistral 7B Instruct, (2) the post-generation validation steps (syntax checking plus manual spot-checks on a subset of queries), and (3) representative counter-examples where the generated SPARQL required manual correction, together with how those cases were handled. Comprehensive end-to-end fidelity metrics across the full regulation set are not available from the original experiments and would require new annotation effort.

    Revision: partial
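
A post-generation syntax check of the kind the rebuttal mentions could start as a cheap pre-execution filter. The sketch below is a heuristic stand-in, not the authors' procedure and not a SPARQL parser: it only checks that the text begins with a read-only query form after any PREFIX declarations and that braces balance outside string literals. The function name `precheck_sparql` is hypothetical; a full parser would be the stronger option.

```python
import re

QUERY_FORMS = ("SELECT", "ASK", "CONSTRUCT", "DESCRIBE")
PREFIX_RE = re.compile(r"\s*PREFIX\s+[^\s<]*\s*<[^>]*>", re.IGNORECASE)
STRING_RE = re.compile(r'"(?:[^"\\]|\\.)*"')


def precheck_sparql(query: str) -> list[str]:
    """Return a list of problems; an empty list means the heuristics passed."""
    problems = []
    # Blank out double-quoted literals so braces inside them are ignored.
    stripped = STRING_RE.sub('""', query)

    # Skip any leading PREFIX declarations, then look at the query form.
    pos = 0
    while (m := PREFIX_RE.match(stripped, pos)) is not None:
        pos = m.end()
    body = stripped[pos:].lstrip()
    if not body.upper().startswith(QUERY_FORMS):
        problems.append("does not start with a query form")

    # Braces must never close before opening and must end balanced.
    depth = 0
    for ch in stripped:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced braces")
    return problems


if __name__ == "__main__":
    q = "PREFIX ex: <http://example.org/> SELECT ?c WHERE { ?d ex:hasClassification ?c . }"
    print(precheck_sparql(q))
```

Queries that fail the filter can be regenerated or routed to manual review; queries that pass still need semantic spot-checks, since this filter says nothing about whether the right regulation was retrieved.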

Circularity Check

0 steps flagged

No circularity; system description relies on external FDA sources and standard LLM without self-referential reductions

Full rationale

The paper describes a framework that extracts regulatory knowledge from external FDA CFR documents into an OWL/RDF knowledge graph and employs the off-the-shelf Mistral 7B Instruct model to generate SPARQL queries for compliance reasoning and device classification. No equations, derivations, fitted parameters, or predictions are present. No self-citations are used to justify uniqueness or load-bearing premises. Validation is asserted via real-world use cases without any internal fitting or renaming of results. The derivation chain is therefore self-contained against external benchmarks and exhibits no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are specified in the provided text.

