What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams. Applied Sciences, 11(14):6421.
2 Pith papers cite this work. Polarity classification is still being indexed.
Fields: cs.AI. Years: 2026. Verdicts: UNVERDICTED. Representative citing papers: 2.
Citing papers
-
MAGE: Multi-Agent Self-Evolution with Co-Evolutionary Knowledge Graphs
MAGE uses a four-subgraph co-evolutionary knowledge graph plus dual bandits to externalize and retrieve experience for stable self-evolution of frozen language-model agents, showing gains on nine diverse benchmarks.
-
XpertBench: Expert-Level Tasks with Rubric-Based Evaluation
XpertBench provides 1,346 rubric-scored expert tasks showing leading LLMs achieve a maximum ~66% success rate and ~55% mean score across domains.