AI-based clinical decision support for primary care: A real-world study

2025 · arXiv 2507.16947

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments

cs.AI · 2026-05-04 · conditional · novelty 8.0

PhysicianBench is a new benchmark of 100 physician-reviewed, execution-grounded tasks in live EHR environments where the best LLM agent reaches only 46% success and open-source models reach 19%.

Towards Conversational Medical AI with Eyes, Ears and a Voice

cs.AI · 2026-05-10 · conditional · novelty 6.0

AI co-clinician is a multimodal conversational AI that uses live audio-visual data for real-time medical reasoning in simulated telemedicine, approaching primary care physicians in management plans and differentials but lagging in physical exam and disease-specific tasks.

Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters

cs.AI · 2026-04-27 · unverdicted · novelty 6.0

Case-specific clinician rubrics for clinical AI notes achieve strong discrimination between outputs, high stability, and clinician-LLM agreement matching clinician-clinician levels at far lower cost.

Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight

cs.AI · 2025-12-22 · conditional · novelty 6.0

Physician oversight reveals high error rates in LLM-generated labels for a clinical benchmark and demonstrates that corrected labels improve both evaluation accuracy and downstream model training.

Benchmarking and Adapting On-Device LLMs for Clinical Decision Support

cs.CL · 2025-12-18 · conditional · novelty 5.0

Fine-tuned on-device LLMs achieve up to 87.9% diagnostic accuracy on clinical tasks, approaching GPT-5.1 at 89.4% while remaining smaller and local.

citing papers explorer

Showing 5 of 5 citing papers.

PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
cs.AI · 2026-05-04 · conditional · none · ref 17

Towards Conversational Medical AI with Eyes, Ears and a Voice
cs.AI · 2026-05-10 · conditional · none · ref 6

Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters
cs.AI · 2026-04-27 · unverdicted · none · ref 11

Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight
cs.AI · 2025-12-22 · conditional · none · ref 8

Benchmarking and Adapting On-Device LLMs for Clinical Decision Support
cs.CL · 2025-12-18 · conditional · none · ref 7
