Nature, volume=
5 Pith papers cite this work.
Years: 2026 (5 citing papers)
citing papers explorer
- Variance-aware Reward Modeling with Anchor Guidance
  Anchor-guided variance-aware reward modeling uses two response-level anchors to resolve non-identifiability in Gaussian models of pluralistic preferences, yielding provable identification, a joint training objective, and improved RLHF performance.
- CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents
  CodeClinic benchmark demonstrates that LLM-generated Python skill libraries from clinical guidelines enhance consistency and reduce token consumption by up to 40% compared to zero-shot approaches on MIMIC-IV based tasks.
- Medical Model Synthesis Architectures: A Case Study
  MedMSA framework retrieves knowledge via language models then builds formal probabilistic models to produce uncertainty-weighted differential diagnoses from symptoms.
- Can Language Models Identify Side Effects of Breast Cancer Radiation Treatments?
  LLMs under-recall rare and long-term radiation side effects in breast cancer, show prompt sensitivity, and improve when outputs are grounded in clinician-curated references.
- From Data Lifting to Continuous Risk Estimation: A Process-Aware Pipeline for Predictive Monitoring of Clinical Pathways
  A process-aware pipeline using logistic regression on prefix representations from 4,479 COVID-19 cases predicts ICU admission with AUC 0.906, improving from 0.642 early to 0.942 later in pathways.