Pith · machine review for the scientific record

arxiv: 2605.00383 · v1 · submitted 2026-05-01 · 💻 cs.CL

Recognition: unknown

Agentic AI for Substance Use Education: Integrating Regulatory and Scientific Knowledge Sources

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 19:27 UTC · model grok-4.3

classification 💻 cs.CL
keywords agentic AI · substance use education · retrieval-augmented generation · regulatory knowledge integration · health education delivery · PubMed queries · expert evaluation

The pith

An agentic AI system that merges fixed regulatory documents with live scientific literature can deliver accurate, context-sensitive substance use education.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs and evaluates a web application that uses retrieval-augmented generation to answer questions about substance use by drawing from a fixed set of 102 DEA regulatory documents stored in vector form and supplementing them with dynamic PubMed searches. This setup targets the practical limits of traditional education, where information quickly becomes outdated and personalized delivery is hard to scale. A sympathetic reader would see value in a method that keeps answers current and traceable to official sources while still being accessible to non-experts.

Core claim

The authors built an agentic AI application that combines a semantically chunked corpus of 102 DEA documents in a vector store with real-time PubMed queries to generate responses to substance-related questions. Five experts posed 30 domain questions, and independent raters scored 90 interactions on factual accuracy, citation quality, contextual coherence, and regulatory appropriateness, yielding mean scores between 4.18 and 4.35 with substantial inter-rater agreement. The work shows that such an architecture can produce transparent, context-sensitive educational content grounded in both regulatory and scientific sources.

What carries the argument

Retrieval-augmented generation pipeline that semantically chunks and vectorizes regulatory documents while adding dynamic literature queries to produce context-sensitive answers.
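The retrieval pattern described above (chunk documents, embed them into a vector store, rank by similarity) can be sketched in miniature. Everything here is a stand-in: a toy bag-of-words embedding replaces the paper's neural embedder, fixed-size chunking replaces semantic chunking, and the two sample documents are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy unit-normalized bag-of-words vector; a real system would use a
    neural embedding model (the paper's stack cites Qwen3-Embedding-8B)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()}

def cosine(a, b):
    # Both vectors are unit-normalized, so the dot product is the cosine.
    return sum(a[w] * b.get(w, 0.0) for w in a)

def chunk(document, size=20):
    """Naive fixed-size chunking; semantic chunking would split on
    meaning boundaries rather than word count."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

class VectorStore:
    def __init__(self):
        self.entries = []  # (chunk_text, embedding, source_id)

    def add_document(self, doc_id, text):
        for c in chunk(text):
            self.entries.append((c, embed(c), doc_id))

    def retrieve(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [(text, src) for text, _, src in ranked[:k]]

store = VectorStore()
store.add_document("DEA-001", "Fentanyl is a Schedule II controlled substance with high potential for abuse")
store.add_document("DEA-002", "Marijuana rescheduling proposals are under review by the administration")
hits = store.retrieve("What schedule is fentanyl under?")
```

In the paper's architecture the top-ranked chunks would then be passed, alongside live PubMed results, into the generation step as grounding context.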

If this is right

  • The system can scale educational delivery beyond the reach of limited human instructors while keeping responses traceable to authoritative sources.
  • Dynamic PubMed integration allows information to stay current as new research appears without manual updates to the entire corpus.
  • High expert ratings on regulatory appropriateness indicate the outputs respect legal and policy constraints in addition to scientific facts.
  • The architecture supports verifiable health education that could be adapted to other regulated domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same retrieval pattern might reduce the spread of misinformation by anchoring answers in official records rather than general web content.
  • Longitudinal tracking of users could test whether exposure to cited, context-aware answers changes behavior more than static materials.
  • Extending the fixed corpus or search filters could address gaps for specific populations or emerging substances not well covered by current PubMed hits.
  • Agencies could deploy similar tools for public outreach, though that would require separate validation of coverage and bias.

Load-bearing premise

Ratings from five experts on thirty questions using a fixed 102-document corpus plus PubMed searches will accurately forecast how the system performs for varied real-world users and will cover the full evolving substance use domain without major omissions or bias.

What would settle it

A larger study with diverse users over months that checks whether the system misses new regulations or studies, produces consistently lower accuracy ratings, or fails to help users with real decisions would falsify the claim of scalable accuracy.

Figures

Figures reproduced from arXiv: 2605.00383 by Kosar Haghani, Mohammed Atiquzzaman, Zahra Kolagar.

Figure 1. Data processing pipeline showing the transformation of raw source materials into searchable …
Figure 2. System architecture overview illustrating the five …
Figure 3. System user interface showing conversational input field, data source sidebar, and chat history.
Figure 4. The system reasoning process illustrates how the agentic orchestration layer analyzes query …
Figure 5. Example system response to a query about cardiovascular effects of cocaine, demonstrating …
Figure 6. Source attribution interface displaying both local DEA documents with relevance match …
read the original abstract

The delivery of traditional substance education has remained problematic due to challenges in scalability, personalization, and the currency of information in a rapidly evolving substance use landscape. While artificial intelligence (AI) offers a promising frontier for enhancing educational delivery, its application in providing real-time, authoritative substance use education remains largely underexplored. We built an agentic-based AI web application that combined Drug Enforcement Administration records with peer-reviewed literature in real-time to provide transparent context-sensitive substance use education. The system uses retrieval-augmented generation with a carefully filtered corpus of 102 documents and dynamic PubMed queries. Document storage was semantically chunked and placed in a vector representation in order to be easily retrieved. We conducted an expert evaluation study in which a panel of five subject matter experts generated 30 domain-specific questions, and two independent raters assessed 90 system interactions (30 primary questions plus two contextual follow-ups each) using a five-point Likert scale across four criteria: factual accuracy, citation quality, contextual coherence, and regulatory appropriateness. Mean ratings ranged from 4.18 to 4.35 across the four criteria (overall category range: 4.05-4.52), with substantial inter-rater agreement (Cohen's kappa = 0.78). These findings suggest that agentic AI architectures integrating authoritative regulatory sources with real-time scientific literature represent a promising direction for scalable, accurate, and verifiable health education delivery, warranting further evaluation through longitudinal user studies.
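The abstract's "dynamic PubMed queries" are presumably issued through NCBI's E-utilities API, which the paper's reference list cites. A minimal offline sketch of what building such a request could look like; the query term and parameter choices are illustrative, not taken from the paper.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def build_esearch_url(term, retmax=5):
    """Construct an NCBI E-utilities esearch request against PubMed.
    Only the URL is built here; fetching and parsing the JSON response
    is left out so the sketch stays offline."""
    params = urlencode({
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retmax": retmax,
        "sort": "relevance",
    })
    return f"{EUTILS}/esearch.fcgi?{params}"

url = build_esearch_url("cocaine cardiovascular effects")
```

A production system would follow the esearch call with efetch or esummary requests for the returned PMIDs, which is roughly the shape a "dynamic literature supplement" step would take.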

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

4 major / 1 minor

Summary. The manuscript describes the development of an agentic AI web application for substance use education that integrates a fixed corpus of 102 DEA regulatory documents with dynamic PubMed retrieval via retrieval-augmented generation and semantic chunking. A panel of five subject matter experts generated 30 domain-specific questions, producing 90 interactions (including two contextual follow-ups each) that were rated by two independent raters on a five-point Likert scale for factual accuracy, citation quality, contextual coherence, and regulatory appropriateness. Mean ratings ranged from 4.18 to 4.35 across criteria (overall range 4.05-4.52) with Cohen's kappa of 0.78. The authors conclude that such agentic architectures integrating authoritative regulatory sources with real-time scientific literature represent a promising direction for scalable, accurate, and verifiable health education delivery.
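For readers unfamiliar with the agreement statistic cited here, unweighted Cohen's kappa takes only a few lines to compute. The ratings below are illustrative, not the paper's data; for ordinal Likert scores a weighted variant is often preferred, but the unweighted form is the standard definition.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    p_e = sum((ca[label] / n) * (cb[label] / n) for label in set(ca) | set(cb))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: two raters scoring ten interactions
# on a five-point scale (invented numbers, not the study's).
a = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
b = [5, 4, 4, 4, 3, 4, 5, 4, 5, 5]
kappa = cohens_kappa(a, b)
```

A kappa of 0.78, as reported, falls in the range conventionally described as substantial agreement.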

Significance. If the evaluation generalizes beyond the tested scope, the work offers a concrete demonstration of combining static authoritative regulatory corpora with live scientific retrieval in a high-stakes domain where information currency and verifiability are critical. The reported inter-rater agreement (kappa 0.78) and consistently high Likert means provide defensible support for system performance on the 90 interactions examined. The explicit call for longitudinal user studies in the conclusion is appropriate and strengthens the paper's positioning as an initial feasibility study rather than a definitive claim.

major comments (4)
  1. [Evaluation study] The 30 questions were generated by the same five subject matter experts who participated in the rating process. This design choice risks selection bias, as the questions may inadvertently favor the system's strengths; an independent or blinded question-generation protocol would be required to support the generalizability claim. (Evaluation study, Abstract)
  2. [Evaluation study] No baseline systems (standard search engines, non-agentic RAG, or human expert responses) are evaluated. Without comparative performance data, the mean ratings of 4.18-4.35 cannot be interpreted as evidence that the agentic architecture improves upon existing methods. (Evaluation study)
  3. [System architecture] The manuscript does not describe the selection or filtering criteria used to assemble the 102-document corpus, nor the precise retrieval, orchestration, and fallback logic for PubMed queries. These details are load-bearing for claims about coverage of the evolving substance-use domain and absence of systematic omissions or biases. (System architecture / Methods)
  4. [Evaluation study] The evaluation rests on only five experts and 30 questions (90 interactions). This limited, non-random sample is insufficient to underwrite the headline claim of a 'promising direction for scalable' delivery across diverse real-world users and emerging substances outside the static corpus. (Abstract, Evaluation study)
minor comments (1)
  1. [Abstract] The abstract states that 'two independent raters assessed 90 system interactions' but does not clarify whether these raters were distinct from the five experts who generated the questions; adding this detail would improve transparency.

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment point by point below, proposing revisions to improve clarity, transparency, and appropriate scoping of claims. Our work is framed as an initial feasibility study, consistent with the call for further longitudinal evaluation in the conclusion.

read point-by-point responses
  1. Referee: [Evaluation study] The 30 questions were generated by the same five subject matter experts who participated in the rating process. This design choice risks selection bias, as the questions may inadvertently favor the system's strengths; an independent or blinded question-generation protocol would be required to support the generalizability claim. (Evaluation study, Abstract)

    Authors: We acknowledge the risk of selection bias inherent in having the same experts generate questions and rate responses. The experts were asked to formulate questions reflecting authentic substance-use education scenarios drawn from their professional experience, without reference to system outputs during question creation. To address this concern, we will add an explicit limitations subsection in the revised manuscript discussing this design choice and recommending that future studies adopt independent or blinded question-generation protocols to enhance generalizability. revision: partial

  2. Referee: [Evaluation study] No baseline systems (standard search engines, non-agentic RAG, or human expert responses) are evaluated. Without comparative performance data, the mean ratings of 4.18-4.35 cannot be interpreted as evidence that the agentic architecture improves upon existing methods. (Evaluation study)

    Authors: We agree that the lack of baseline comparisons prevents direct claims of improvement over existing approaches. The current evaluation establishes absolute performance levels for this integrated agentic system, supported by high inter-rater reliability. We will revise the discussion to explicitly identify the absence of baselines as a limitation and state that future work should include comparative evaluations against standard search engines, non-agentic RAG, and human expert responses. revision: partial

  3. Referee: [System architecture] The manuscript does not describe the selection or filtering criteria used to assemble the 102-document corpus, nor the precise retrieval, orchestration, and fallback logic for PubMed queries. These details are load-bearing for claims about coverage of the evolving substance-use domain and absence of systematic omissions or biases. (System architecture / Methods)

    Authors: This is a valid point on methodological transparency. The revised manuscript will expand the Methods section to detail the corpus selection and filtering criteria (including relevance to DEA-controlled substances, recency, and exclusion rules), as well as the retrieval mechanisms, agent orchestration logic, semantic chunking parameters, and fallback procedures for PubMed queries. These additions will improve reproducibility and allow readers to assess potential coverage gaps or biases. revision: yes

  4. Referee: [Evaluation study] The evaluation rests on only five experts and 30 questions (90 interactions). This limited, non-random sample is insufficient to underwrite the headline claim of a 'promising direction for scalable' delivery across diverse real-world users and emerging substances outside the static corpus. (Abstract, Evaluation study)

    Authors: We present the study as a feasibility demonstration rather than a broad validation, as indicated by the conclusion's explicit call for longitudinal user studies. While the sample is limited, the 90 interactions yielded consistently high ratings and substantial agreement (kappa = 0.78). We will revise the abstract and discussion to more precisely qualify the scope, clarify that scalability and coverage of emerging substances require additional empirical testing with larger and more diverse cohorts, and avoid language that could imply immediate generalizability. revision: partial

Circularity Check

0 steps flagged

No circularity: claims rest on direct expert ratings of system outputs

full rationale

The paper describes a RAG-based agentic system built from a 102-document corpus plus PubMed retrieval, then reports an empirical evaluation in which five SMEs generated 30 questions and two independent raters scored 90 interactions on four Likert criteria, yielding means of 4.18–4.35 and kappa 0.78. No equations, fitted parameters, predictive models, or self-citations are invoked to derive the performance numbers; the ratings are presented as external measurements rather than quantities that reduce to the system's own inputs or prior author work by construction. The conclusion that the architecture is 'promising' is framed as an empirical observation warranting further study, not a derivation that collapses into its own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard assumptions from retrieval-augmented generation systems and the validity of small-scale expert ratings as a proxy for educational quality, without new fitted parameters or postulated entities.

axioms (1)
  • domain assumption Retrieval-augmented generation with semantic chunking and vector embeddings can accurately retrieve relevant regulatory and scientific documents for query answering.
    This underpins the system's ability to provide context-sensitive education without hallucination.

pith-pipeline@v0.9.0 · 5569 in / 1507 out tokens · 62865 ms · 2026-05-09T19:27:02.489491+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

55 extracted references · 36 canonical work pages · 3 internal anchors

  1. [1]

    Drug Enforcement Administration

    U.S. Drug Enforcement Administration. Drug fact sheets [Internet]. Washington (DC): DEA; 2024 [cited 2024 Dec 15]. Available from: https://www.dea.gov/factsheets

  2. [2]

    MEDLINE/PubMed database [Internet]

    National Library of Medicine. MEDLINE/PubMed database [Internet]. Bethesda (MD): NLM; 2024 [cited 2024 Dec 15]. Available from: https://www.nlm.nih.gov/medline/

  3. [3]

    PubMed [Internet]

    National Center for Biotechnology Information. PubMed [Internet]. Bethesda (MD): National Library of Medicine; 2024 [cited 2024 Dec 15]. Available from: https://pubmed.ncbi.nlm.nih.gov/

  4. [4]

    Global status report on alcohol and health and treatment of substance use disorders

    World Health Organization. Global status report on alcohol and health and treatment of substance use disorders [Internet]. Geneva: WHO; 2023 [cited 2024 Dec 15]. Available from: https://www.who.int/publications/i/item/9789240074033

  5. [5]

    Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements

    Torous J, Nicholas J, Larsen ME, Firth J, Christensen H. Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements. Evid Based Ment Health. 2018;21(3):116-119. doi: 10.1136/eb-2018-102891

  6. [6]

    Time for dope

    Daniulaityte R, Nahhas RW, Wijeratne S, Carlson RG, Lamy FR, Martins SS, et al. "Time for dope": Analysis of Twitter data on emerging trends in drug use. J Med Internet Res. 2015;17(10):e240. doi: 10.2196/jmir.4597

  7. [7]

    The role of internet emotional relationships, family and marital intimacy in predicting substance use relapse: A structural equation modeling study

    Khorrami M, Haghani K, Kholerdi FA, Ghavasi F, Vayani PH, Karkaragh FF. The role of internet emotional relationships, family and marital intimacy in predicting substance use relapse: A structural equation modeling study. Emerg Trends Drugs Addict Health. 2025;5:100183. doi: 10.1016/j.etdah.2025.100183

  8. [8]

    Understanding medical hallucinations: How LLM hallucinations impact patient safety

    Kim J, et al. Understanding medical hallucinations: How LLM hallucinations impact patient safety [Preprint]. arXiv/Harvard Medical School; 2025. Available from: https://arxiv.org/abs/2503.05777

  9. [9]

    Generative AI and large language models in mitigating medication-related harm

    National Institutes of Health. Generative AI and large language models in mitigating medication-related harm. NIH Scoping Review. Bethesda (MD): NIH; 2025. Available from: https://www.ncbi.nlm.nih.gov/books/NBK607539/

  10. [10]

    The future of MedTech compliance: How dynamic data is transforming regulatory processes

    IQVIA. The future of MedTech compliance: How dynamic data is transforming regulatory processes. IQVIA Report. Durham (NC): IQVIA; 2025. Available from: https://www.iqvia.com/insights/the-iqvia-institute/reports-and-publications/reports/the-future-of-medtech-compliance

  11. [11]

    Stigmatizing language in large language models for alcohol and substance use disorders

    Wang Y, et al. Stigmatizing language in large language models for alcohol and substance use disorders. J Addict Med. 2025. In press. Available from: https://doi.org/10.1097/ADM.0000000000001536

  12. [12]

    Dangers of LLM therapists: Stigma and safety in AI-driven mental health

    Haber N, Moore J. Dangers of LLM therapists: Stigma and safety in AI-driven mental health. In: Proceedings of the Stanford University/ACM Conference on Fairness & Accountability; 2025. Available from: https://dl.acm.org/doi/proceedings/10.1145/3715275

  13. [13]

    Exploring the comparative efficacy of reality and paradox therapy in treating post-traumatic stress disorder in traumatized adolescents: An analytical review

    Asadiof F, Safarpour B, Barabadi S, Karkargh FF, Janbozorgi A, Khayayi R, Delshadi M, Haghani K. Exploring the comparative efficacy of reality and paradox therapy in treating post-traumatic stress disorder in traumatized adolescents: An analytical review. Contemporary Readings in Law and Social Justice. 2024;16(1):645-652

  14. [14]

    Retrieval-augmented generation (RAG) and LLMs for enterprise knowledge management

    MDPI. Retrieval-augmented generation (RAG) and LLMs for enterprise knowledge management. MDPI Review. Basel: MDPI; 2024. Available from: https://doi.org/10.3390/app16010368

  15. [15]

    A systematic review of chatbot-assisted interventions for substance use

    Lee S, Yoon J, Cho Y, Chun J. A systematic review of chatbot-assisted interventions for substance use. Front Psychiatry. 2024;15:1456689. doi: 10.3389/fpsyt.2024.1456689

  16. [16]

    Large language models could change the future of behavioral healthcare: A proposal for responsible development and evaluation

    Stade EC, Stirman SW, Ungar LH, Boland CL, Schwartz HA, Yaden DB, Sedoc J, DeRubeis RJ, Willer R, Eichstaedt JC. Large language models could change the future of behavioral healthcare: A proposal for responsible development and evaluation. npj Mental Health Res. 2024;3:12. doi: 10.1038/s44184-024-00056-z

  17. [17]

    Applications and future prospects of medical LLMs: A survey based on the M-KAT conceptual framework

    Zhang L, Hou X, Jiang Y, Li M, Chen Z, Xu Q, Yang D, Wei J. Applications and future prospects of medical LLMs: A survey based on the M-KAT conceptual framework. J Med Syst. 2024;48:132. doi: 10.1007/s10916-024-02132-5

  18. [18]

    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

    Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, Küttler H, Lewis M, Yih W, Rocktäschel T, Riedel S, Kiela D. Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020); 2020. p. 9459-9474.. Available from: https://arxiv.org/abs/2005.11401

  19. [19]

    Retrieval-Augmented Generation for Large Language Models: A Survey

    Gao Y, Xiong Y, Gao X, Jia K, Pan J, Bi Y, Dai Y, Sun J, Wang H. Retrieval-augmented generation for large language models: A survey [Preprint]. arXiv; 2023. arXiv:2312.10997.. Available from: https://arxiv.org/abs/2312.10997

  20. [20]

    Patient engagement with conversational agents in health applications 2016-2022: A systematic review and meta-analysis

    Cevasco KE, Morrison Brown RE. Patient engagement with conversational agents in health applications 2016-2022: A systematic review and meta-analysis. J Med Syst. 2024;48:94. doi: 10.1007/s10916-024-02059-x

  21. [21]

    Chatbot for health care and oncology applications using artificial intelligence and machine learning: Systematic review

    Xu L, Sanders L, Li K, Chow JCL. Chatbot for health care and oncology applications using artificial intelligence and machine learning: Systematic review. JMIR Cancer. 2021;7(4):e27850. doi: 10.2196/27850

  22. [22]

    I-it, I-thou, I-robot: The perceived humanness of AI in human-machine communication

    Westerman D, Edwards AP, Edwards C, Luo Z, Spence PR. I-it, I-thou, I-robot: The perceived humanness of AI in human-machine communication. Commun Stud. 2020;71(3):393-. doi: 10.1080/10510974.2020.1749683

  24. [24]

    The use of chatbots as supportive agents for people seeking help with substance use disorder: A systematic review

    Ogilvie L, Prescott J, Carson J. The use of chatbots as supportive agents for people seeking help with substance use disorder: A systematic review. Eur Addict Res. 2022;28(6):405-418. doi: 10.1159/000525959

  25. [25]

    Evaluation of the performance of three large language models in clinical decision support: A comparative study based on actual cases

    Liu X, Zhang M, Wang Y, Chen H, Li Y. Evaluation of the performance of three large language models in clinical decision support: A comparative study based on actual cases. J Med Syst. 2025;49:15. doi: 10.1007/s10916-025-02152-9

  26. [26]

    ChatGPT: A conceptual review of applications and utility in the field of medicine

    Cascella M, Bellini V, Bignami E, Montomoli J. ChatGPT: A conceptual review of applications and utility in the field of medicine. J Med Syst. 2024;48:75. doi: 10.1007/s10916-024-02075-x

  27. [27]

    Gender differences in sleep quality among Iranian traditional and industrial drug users

    Khorrami M, Khorrami F, Haghani K, Karkaragh FF, Khodashenas A, Souri S. Gender differences in sleep quality among Iranian traditional and industrial drug users. Neurobiol Sleep Circadian Rhythms. 2024;17:100104. doi: 10.1016/j.nbscr.2024.100104

  28. [28]

    Women's perceptions of gender inequality in the divorce process

    Haghani K, Williams JL, Sosa A. Women's perceptions of gender inequality in the divorce process. Journal of Public and Professional Sociology. 2025;17(1)

  29. [29]

    Drug use and artificial intelligence: Weighing concerns and possibilities for prevention

    Ezell JM, Ajayi BP, Parikh T, Kemp CG, Ompad DC. Drug use and artificial intelligence: Weighing concerns and possibilities for prevention. Am J Prev Med. 2024;66(3):559-565. doi: 10.1016/j.amepre.2023.10.007

  30. [30]

    Optimizing digital tools for the field of substance use and substance use disorders: Backcasting exercise

    Scheibein F, Caballeria E, Taher MA, Arya S, Bancroft A, Dannatt L, De Kock C, Chaudhary NI, Gayo RP, Ghosh A, Gelberg L, Goos C, Gordon R, Gual A, Hill P, Jeziorska I, Kurcevič E, Lakhov A, Maharjan I, Matrai S, Morgan N, Paraskevopoulos I, Puharić Z, Sibeko G, Stola J, Tiburcio M, Tay Wee Teck J, Tsereteli Z, López-Pelayo H. Optimizing digital tools for...

  31. [31]

    Use of a machine learning framework to predict substance use disorder treatment success

    Acion L, Kelmansky D, van der Laan M, Sahker E, Jones D, Arndt S. Use of a machine learning framework to predict substance use disorder treatment success. PLoS One. 2017;12(4):e0175383. doi: 10.1371/journal.pone.0175383

  32. [32]

    Machine learning-based outcome prediction and novel hypotheses generation for substance use disorder treatment

    Nasir M, Summerfield NS, Oztekin A, Knight M, Ackerson LK, Carreiro S. Machine learning-based outcome prediction and novel hypotheses generation for substance use disorder treatment. J Am Med Inform Assoc. 2021;28(6):1216-1224. doi: 10.1093/jamia/ocaa350

  33. [33]

    Improving treatment completion for young adults with substance use disorder: Machine learning-based prediction algorithms

    Tasnim M, Sahker E. Improving treatment completion for young adults with substance use disorder: Machine learning-based prediction algorithms. J Subst Abuse Treat. 2024;165:109425. doi: 10.1016/j.jsat.2024.109425

  34. [34]

    Craving for a robust methodology: A systematic review of machine learning algorithms on substance-use disorders treatment outcomes

    Uzuegbunam N, Wong WHT, Cheung JMY, Richi Nayak R. Craving for a robust methodology: A systematic review of machine learning algorithms on substance-use disorders treatment outcomes. Int J Ment Health Addict. 2024. doi: 10.1007/s11469-024-01403-z

  35. [35]

    Predictors of treatment attrition among individuals in substance use disorder treatment: A machine learning approach

    Rabinowitz JA, Wells JL, Kahn G, Ellis JD, Strickland JC, Hochheimer M, Huhn AS. Predictors of treatment attrition among individuals in substance use disorder treatment: A machine learning approach. Addict Behav. 2025;163:108265. doi:10.1016/j.addbeh.2025.108265

  36. [36]

    Evaluating generative AI responses to real-world drug-related questions

    Giorgi S, Isman K, Liu T, Fried Z, Sedoc J, Curtis B. Evaluating generative AI responses to real-world drug-related questions. Psychiatry Res. 2024;339:116058. doi: 10.1016/j.psychres.2024.116058

  37. [37]

    Integrating health literacy into a theory- based drug-use prevention program: A quasi-experimental study among junior high students in Taiwan

    Lin HW, Chen KY, Liao LL, Chang LC, Kao CC. Integrating health literacy into a theory- based drug-use prevention program: A quasi-experimental study among junior high students in Taiwan. BMC Public Health. 2021;21:1778. doi: 10.1186/s12889-021-11819-1

  38. [38]

    Effectiveness of a hybrid digital substance abuse prevention approach combining e-learning and in-person class sessions

    Williams CL, Botvin GJ, Griffin KW, Santana N, Dukarm J. Effectiveness of a hybrid digital substance abuse prevention approach combining e-learning and in-person class sessions. Front Public Health. 2022;10:917267. doi: 10.3389/fpubh.2022.917267

  39. [39]

    PyMuPDF4LLM (Version 0.0.1) [Computer software]

    Artifex Software. PyMuPDF4LLM (Version 0.0.1) [Computer software]. 2024. Available from: https://github.com/pymupdf/PyMuPDF4LLM

  40. [40]

    OCRmyPDF (Version 16.0.0) [Computer software]

    Mayer J. OCRmyPDF (Version 16.0.0) [Computer software]. 2023. Available from: https://github.com/ocrmypdf/OCRmyPDF

  41. [41]

    Docling (Version 2.0.0) [Computer software]

    IBM Research. Docling (Version 2.0.0) [Computer software]. 2024. Available from: https://github.com/DS4SD/docling

  42. [42]

    yt-dlp: Video downloader and caption extractor [Computer software]

    yt-dlp. yt-dlp: Video downloader and caption extractor [Computer software]. 2024. Available from: https://github.com/yt-dlp/yt-dlp

  43. [43]

    A Survey of Large Language Models

    Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, et al. A survey of large language models [Preprint]. arXiv; 2023. arXiv:2303.18223. Available from: https://arxiv.org/abs/2303.18223

  44. [44]

    Qwen3-Embedding-8B model [Internet]

    Qwen Team. Qwen3-Embedding-8B model [Internet]. Hugging Face Model Hub; 2024 [cited 2024 Dec 15]. Available from: https://huggingface.co/Qwen/Qwen3-Embedding-8B

  45. [45]

    ChromaDB: Open-source vector database for RAG applications [Internet]

    ChromaDB. ChromaDB: Open-source vector database for RAG applications [Internet]. 2024 [cited 2024 Dec 15]. Available from: https://www.trychroma.com

  46. [46]

    Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs

    Malkov YA, Yashunin DA. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Trans Pattern Anal Mach Intell. 2018;42(4):824-836. doi: 10.1109/TPAMI.2018.2851828

  47. [47]

    Streamlit: Python framework for web-based machine learning applications [Internet]

    Streamlit. Streamlit: Python framework for web-based machine learning applications [Internet]. 2024 [cited 2024 Dec 15]. Available from: https://streamlit.io

  48. [48]

    LangChain: Building applications with LLMs through composability [Internet]

    LangChain, Inc. LangChain: Building applications with LLMs through composability [Internet]. 2024 [cited 2024 Dec 15]. Available from: https://www.langchain.com/

  49. [49]

    E-utilities API documentation [Internet]

    National Center for Biotechnology Information (NCBI). E-utilities API documentation [Internet]. Bethesda (MD): NCBI; 2023 [cited 2024 Dec 15]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK25501/

  50. [50]

    JSON-RPC 2.0 specification [Internet]

    JSON-RPC Working Group. JSON-RPC 2.0 specification [Internet]. 2013 [cited 2024 Dec 15]. Available from: https://www.jsonrpc.org/specification

  51. [51]

    Sentence-BERT: Sentence embeddings using Siamese BERT-networks

    Reimers N, Gurevych I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); 2019. p. 3982-3992. doi: 10.18653/v1/D19-1410

  52. [52]

    A comprehensive evaluation in medical curriculum using the Kirkpatrick hierarchical approach: A review and update

    Kusmiati M. A comprehensive evaluation in medical curriculum using the Kirkpatrick hierarchical approach: A review and update. Med Res Arch. 2025;13(5). Available from: https://doi.org/10.18103/mra.v13i5.6557

  53. [53]

    Mapping and assessing existing digital skills training: A 4-level Kirkpatrick analysis

    Protogiros D, et al. Mapping and assessing existing digital skills training: A 4-level Kirkpatrick analysis. J Med Internet Res. 2025;27:e71657

  54. [54]

    AI chatbots and students' mental health support: An efficacy review

    Konadu BO, Kusi E. AI chatbots and students' mental health support: An efficacy review. Am J Educ Learn. 2025;10(2). Available from: https://doi.org/10.55284/ajel.v10i2.1554

  55. [55]

    Artificial intelligence in virtual reality simulation for interprofessional communication training: mixed method study

    Liaw SY, Tan JZ, Lim S, Koh Y, Tan SC, Lee SY, Chua WL. Artificial intelligence in virtual reality simulation for interprofessional communication training: mixed method study. Nurse Educ Today. 2023;122:105718. doi: 10.1016/j.nedt.2023.105718