Physician oversight reveals high error rates in LLM-generated labels for a clinical benchmark and demonstrates that corrected labels improve both evaluation accuracy and downstream model training.
Training LLMs for EHR-Based Reasoning Tasks via Rein- forcement Learning
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.AI 2roles
background 1polarities
background 1representative citing papers
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
citing papers explorer
-
Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight
Physician oversight reveals high error rates in LLM-generated labels for a clinical benchmark and demonstrates that corrected labels improve both evaluation accuracy and downstream model training.
-
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.