EpiGraph creates a heterogeneous epilepsy knowledge graph that boosts LLM performance on clinical reasoning tasks by 30-41% in pharmacogenomics when used with Graph-RAG.
arXiv preprint arXiv:2504.00993 (2025)
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 9roles
background 1polarities
background 1representative citing papers
RadThinking releases a large longitudinal CT VQA dataset stratified into foundation perception questions, single-rule reasoning questions, and compositional multi-step chains grounded in clinical reporting standards for cancer screening.
CLR-voyance reformulates inpatient reasoning as POMDP with clinician-validated outcome rubrics, yielding an 8B model that outperforms larger frontier models on the authors' new benchmark.
BioMedArena releases a standardized toolkit with 147 biomedical benchmarks, 75 tools, and six harnesses that achieve SOTA results on eight tasks with a +15.03 percentage point average lift.
MedSynapse-V evolves latent diagnostic memories via meta queries, causal counterfactual refinement with RL, and dual-branch memory transition to outperform prior medical VLM methods in diagnostic accuracy.
A new counterfactual multi-agent framework improves LLM diagnostic accuracy by quantifying confidence shifts from edited clinical findings and guiding specialist discussions.
ReMedi boosts LLM performance on EHR clinical predictions by up to 19.9% F1 through ground-truth-guided rationale regeneration and fine-tuning.
VeriLLMed is an interactive visual debugging tool that maps LLM diagnostic reasoning to knowledge graphs to identify and categorize relation, branch, and missing errors.
DeepER-Med introduces a three-module agentic AI workflow for evidence-based medical research that outperforms production platforms on a new expert-curated dataset of 100 questions and matches clinical recommendations in seven of eight real-world cases.
citing papers explorer
-
EpiGraph: Building Generalists for Evidence-Intensive Epilepsy Reasoning in the Wild
EpiGraph creates a heterogeneous epilepsy knowledge graph that boosts LLM performance on clinical reasoning tasks by 30-41% in pharmacogenomics when used with Graph-RAG.
-
RadThinking: A Dataset for Longitudinal Clinical Reasoning in Radiology
RadThinking releases a large longitudinal CT VQA dataset stratified into foundation perception questions, single-rule reasoning questions, and compositional multi-step chains grounded in clinical reporting standards for cancer screening.
-
CLR-voyance: Reinforcing Open-Ended Reasoning for Inpatient Clinical Decision Support with Outcome-Aware Rubrics
CLR-voyance reformulates inpatient reasoning as POMDP with clinician-validated outcome rubrics, yielding an 8B model that outperforms larger frontier models on the authors' new benchmark.
-
BioMedArena: An Open-source Toolkit for Building and Evaluating Biomedical Deep Research Agents
BioMedArena releases a standardized toolkit with 147 biomedical benchmarks, 75 tools, and six harnesses that achieve SOTA results on eight tasks with a +15.03 percentage point average lift.
-
MedSynapse-V: Bridging Visual Perception and Clinical Intuition via Latent Memory Evolution
MedSynapse-V evolves latent diagnostic memories via meta queries, causal counterfactual refinement with RL, and dual-branch memory transition to outperform prior medical VLM methods in diagnostic accuracy.
-
Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning
A new counterfactual multi-agent framework improves LLM diagnostic accuracy by quantifying confidence shifts from edited clinical findings and guiding specialist discussions.
-
ReMedi: Reasoner for Medical Clinical Prediction
ReMedi boosts LLM performance on EHR clinical predictions by up to 19.9% F1 through ground-truth-guided rationale regeneration and fine-tuning.
-
VeriLLMed: Interactive Visual Debugging of Medical Large Language Models with Knowledge Graphs
VeriLLMed is an interactive visual debugging tool that maps LLM diagnostic reasoning to knowledge graphs to identify and categorize relation, branch, and missing errors.
-
DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI
DeepER-Med introduces a three-module agentic AI workflow for evidence-based medical research that outperforms production platforms on a new expert-curated dataset of 100 questions and matches clinical recommendations in seven of eight real-world cases.