pith. machine review for the scientific record.

arxiv: 2604.22604 · v2 · submitted 2026-04-24 · 💻 cs.HC

Recognition: unknown

Vibe coding for clinicians: democratising bespoke software development for digital health innovation

Pith reviewed 2026-05-08 10:32 UTC · model grok-4.3

classification 💻 cs.HC
keywords: vibe coding, clinicians, digital health, large language models, software prototyping, clinical workflows, AI-assisted development, health innovation

The pith

Vibe coding lets clinicians prototype bespoke digital health tools by prompting large language models in natural language.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Clinicians regularly encounter workflow problems too specific or low-volume to interest commercial developers. The paper claims that vibe coding, where clinicians describe needs in plain language to large language models, equips them to build or prototype their own solutions. This bridges the gap between front-line clinical insight and technical execution without requiring prior programming expertise. The authors supply foundational guidance, a step-by-step playbook, real examples, and explicit guardrails to keep the process practical and safe. They present the method as a complement to professional developers rather than a substitute.

Core claim

Vibe coding refers to the co-development of software using natural language prompts to large language models. The paper argues that this approach democratises bespoke software development for clinicians, enabling them to rapidly create simple tools or prototypes that address real-world pain points and to produce digital health solutions most reflective of clinical realities.

What carries the argument

Vibe coding, defined as the co-development of software through natural language prompts to large language models, carries the argument by turning clinical descriptions directly into functional prototypes.

If this is right

  • Clinicians without coding experience gain the ability to address bespoke workflow problems that commercial software ignores.
  • Rapid prototyping becomes accessible, allowing tools to be built that more closely match clinical realities than off-the-shelf options.
  • A shared playbook and case examples lower the barrier for early adopters to begin creating their own solutions.
  • Explicit caveats and guardrails are required to ensure safe deployment alongside professional developers.
  • The method positions clinicians as active participants in digital health innovation rather than passive users.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Widespread adoption would require new training on prompt engineering and code validation specific to regulated health environments.
  • Integration challenges with existing hospital IT systems and data privacy rules may limit how far prototypes can move into production.
  • Over-reliance on generated code without human oversight could create new liability questions for clinicians who deploy the results.
  • The approach may scale best for low-stakes tools and serve mainly as a discovery step before handing off to professional teams.

Load-bearing premise

Large language models can reliably generate functional, safe, and maintainable code for clinical applications from natural language prompts without introducing critical errors or compliance issues.

What would settle it

A documented case in which code generated through vibe coding produced a functional failure, patient safety issue, data breach, or regulatory violation in an actual clinical workflow would falsify the central claim.

Figures

Figures reproduced from arXiv: 2604.22604 by Ariel Yuhan Ong, Caroline Kilduff, David A Merle, Eden Ruffell, Fares Antaki, Iain Livingstone, Mertcan Sevgi, Pearse A Keane.

Figure 1: The vibe coding workflow illustrating the iterative cycle of software development with a ...
Figure 2: The user interface for a simple intravitreal injection interval calculator featuring two date ...
Figure 3: The “Cataract Calendar” tool featuring the online user interface which can generate a ...

Original abstract

Clinicians often face workflow problems that are perceived as either too bespoke or low stakes to attract commercial attention. Historically, most do not have the technical knowledge to address these problems, but the recent emergence of "vibe coding" presents a transformative opportunity. Vibe coding refers to the co-development of software using natural language prompts to large language models. It offers a pathway to create simple tools that address these real-world pain points, or to prototype more complex ideas. In this review, written by a group of early adopter clinicians with a range of programming expertise, we introduce vibe coding for clinicians (especially those with no or minimal coding experience) as a way of democratising innovation from the front lines. We discuss foundational skills, outline some common challenges, provide a practical step-by-step playbook, and illustrate this approach with some case examples, taking care to consider caveats and guardrails for deployment. We propose that vibe coding is more than a technical shortcut for beginners and is not a replacement for professional software developers. Instead, it can bridge the gap between clinical insight and technical execution, equipping clinicians with the ability to rapidly prototype digital health solutions most reflective of clinical realities.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces 'vibe coding'—the co-development of software via natural-language prompts to large language models—as a means for clinicians (including those with minimal coding experience) to create bespoke digital health tools addressing workflow problems overlooked by commercial developers. Drawing on the authors' experience as early adopters, it covers foundational skills, common challenges, a practical step-by-step playbook, illustrative case examples, and caveats/guardrails, arguing that the approach can bridge clinical insight and technical execution to enable rapid prototyping of solutions most reflective of clinical realities.

Significance. If the core premise holds, the work could have meaningful significance for human-computer interaction and digital health by lowering barriers to clinician-led innovation in niche or low-stakes workflow areas. The manuscript's practical playbook, attention to guardrails, and grounding in real-world clinician experience provide a useful starting point for the community; however, its conceptual nature and lack of empirical validation mean the significance remains prospective rather than demonstrated.

major comments (2)
  1. [Abstract] Abstract: the central claim that vibe coding 'equips clinicians with the ability to rapidly prototype digital health solutions most reflective of clinical realities' is load-bearing for the paper's contribution yet rests on the untested assumption that LLM-generated code will be functional, safe, and maintainable; the manuscript provides no quantitative metrics on prompt success rates, error incidence, usability, or regulatory compliance, and the case examples remain qualitative anecdotes.
  2. [Case examples and guardrails] Sections on case examples and guardrails: while caveats are acknowledged, the absence of any controlled evaluation, post-deployment validation, or discussion of how clinicians would verify LLM outputs for clinical safety undermines the proposal's readiness for the digital health domain, where errors carry high stakes.
minor comments (2)
  1. [Playbook] The step-by-step playbook would be strengthened by including concrete prompt templates, example LLM responses, and failure modes to improve actionability for readers with no coding background.
  2. [Challenges] Additional citations to empirical studies on LLM code-generation reliability (especially in safety-critical or regulated domains) would better contextualize the challenges section.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thoughtful and constructive review. We appreciate the acknowledgment of the manuscript's potential significance as a practical starting point for clinician-led innovation. The paper is explicitly framed as a conceptual review and experiential guide by early adopters, not as an empirical study providing quantitative validation. We address the major comments below and outline targeted revisions to clarify scope and strengthen caveats without altering the core contribution.

point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that vibe coding 'equips clinicians with the ability to rapidly prototype digital health solutions most reflective of clinical realities' is load-bearing for the paper's contribution yet rests on the untested assumption that LLM-generated code will be functional, safe, and maintainable; the manuscript provides no quantitative metrics on prompt success rates, error incidence, usability, or regulatory compliance, and the case examples remain qualitative anecdotes.

    Authors: We agree that the manuscript offers no quantitative metrics or controlled data, as it draws from the authors' qualitative experiences as early adopters rather than experimental evaluation. The central claim is presented prospectively, based on observed patterns in real-world prototyping. We will revise the abstract to explicitly state that the described benefits are grounded in experiential insights and subject to the guardrails detailed in the main text, while removing any implication of demonstrated functionality or safety. The case examples will be reframed more clearly as illustrative anecdotes to demonstrate workflow applicability.

    Revision: partial

  2. Referee: [Case examples and guardrails] Sections on case examples and guardrails: while caveats are acknowledged, the absence of any controlled evaluation, post-deployment validation, or discussion of how clinicians would verify LLM outputs for clinical safety undermines the proposal's readiness for the digital health domain, where errors carry high stakes.

    Authors: We concur that the paper lacks controlled evaluations or post-deployment data, which would be required to claim readiness for clinical deployment. The manuscript already limits its scope to low-stakes prototyping and explicitly states that vibe coding is not a substitute for professional software development or regulatory processes. The guardrails section discusses verification steps including manual code review, unit testing, and clinician oversight. We will expand this section with additional practical guidance on output verification (e.g., iterative prompting for edge cases and integration with existing clinical review workflows) and add a dedicated limitations paragraph calling for future empirical studies on safety and efficacy.

    Revision: partial
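
The verification steps named in the response above (unit testing, checking edge cases) can be made concrete with a small sketch. It is not taken from the paper. It assumes, hypothetically, that an interval calculator like the one in Figure 2 reduces to a function that returns the gap between two dates as whole weeks plus remaining days. The name `injection_interval`, the rejection of reversed dates, and the test dates are all illustrative choices.

```python
from datetime import date


def injection_interval(previous: date, planned: date) -> tuple[int, int]:
    """Return the gap between two dates as (whole weeks, remaining days).

    Hypothetical helper for illustration only; not the paper's code.
    """
    if planned < previous:
        # Assumed behaviour: a reversed date pair is treated as an input error.
        raise ValueError("planned date precedes previous date")
    return divmod((planned - previous).days, 7)


def run_checks() -> None:
    """Edge cases a clinician might ask the LLM to cover, then verify by hand."""
    # Typical case: 6 Jan 2025 to 3 Mar 2025 is 56 days, i.e. 8 weeks exactly.
    assert injection_interval(date(2025, 1, 6), date(2025, 3, 3)) == (8, 0)
    # Same-day input gives a zero interval.
    assert injection_interval(date(2025, 1, 6), date(2025, 1, 6)) == (0, 0)
    # Leap year: 26 Feb to 4 Mar spans 7 days in 2024 ...
    assert injection_interval(date(2024, 2, 26), date(2024, 3, 4)) == (1, 0)
    # ... but only 6 days in 2025.
    assert injection_interval(date(2025, 2, 26), date(2025, 3, 4)) == (0, 6)
    # Reversed dates must be rejected rather than silently "corrected".
    try:
        injection_interval(date(2025, 3, 3), date(2025, 1, 6))
    except ValueError:
        pass
    else:
        raise AssertionError("reversed dates should be rejected")


if __name__ == "__main__":
    run_checks()
    print("all checks passed")
```

Each expected value here can be confirmed against a paper calendar, so the clinician validates the tool's behaviour independently of the model that generated it. Checks of this kind do not address the regulatory or deployment concerns raised by the referee.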

standing simulated objections not resolved
  • Provision of quantitative metrics on prompt success rates, error incidence, usability, or regulatory compliance, as these require a separate empirical study outside the scope of this conceptual review.

Circularity Check

0 steps flagged

No circularity: conceptual review with no derivations or self-referential reductions

full rationale

The manuscript is a descriptive review and practical guide on vibe coding. It advances no mathematical models, equations, predictions, fitted parameters, or derivation chains. Claims rest on authors' stated experiences as early adopters and general discussion of LLM capabilities, without any step that reduces by construction to its own inputs or to a self-citation chain. No instances of self-definitional logic, fitted-input predictions, or ansatz smuggling appear. The paper is self-contained as advisory content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the untested assumption that current large language models can produce usable clinical software from natural language prompts and that clinicians can safely deploy the outputs with minimal oversight.

axioms (1)
  • domain assumption: Large language models can generate functional and safe code for clinical tools from natural language descriptions provided by non-programmers.
    This premise underpins the entire proposal that vibe coding democratises development; it is implicit throughout the abstract.

pith-pipeline@v0.9.0 · 5531 in / 1197 out tokens · 72812 ms · 2026-05-08T10:32:03.476284+00:00 · methodology
