EHRBench uses an EHR-LLM-KB pipeline to automatically create 960,067 reliable QA items spanning diagnosis, treatment, and prognosis for large-scale LLM evaluation in clinical decision making.
Dorfner, Amin Dada, Felix Busch, Marcus R
2 Pith papers cite this work. Polarity classification is still indexing.
Making Knowledge Accessible: Divergent Readability-Accuracy Strategies of Mistral and QWen in Biomedical Text Simplification
Mistral uses careful lexical simplification to raise readability while keeping BERTScore at 0.91 comparable to humans, whereas QWen improves readability but shows a disconnect with its 0.89 BERTScore in biomedical text simplification.