ACE inhibitors prevent bradykinin degradation (increase) and block Ang I→Ang II conversion (decrease)
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Healthcare AI GYM for Medical Agents
A new Gym environment for medical AI agents reveals collapse in multi-turn RL due to sparse rewards, addressed by Turn-level Truncated On-Policy Distillation yielding +3.9 pp gains on clinical benchmarks.