CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning

Eric Onyame , Akash Ghosh , Subhadip Baidya , Sriparna Saha , Xiuying Chen , Chirag Agarwal


classification 💻 cs.AI cs.CL

keywords reasoning, multilingual, language, medical, correctness, cure-med, dataset, languages


While large language models (LLMs) have been shown to perform well on monolingual mathematical and commonsense reasoning, they remain unreliable for multilingual medical reasoning, which hinders their deployment in multilingual healthcare settings. We address this by first introducing CUREMED-BENCH, a high-quality multilingual medical reasoning dataset of open-ended reasoning queries, each with a single verifiable answer, spanning thirteen languages, including underrepresented languages such as Amharic, Yoruba, and Swahili. Building on this dataset, we propose CURE-MED, a curriculum-informed reinforcement learning framework that integrates code-switching-aware supervised fine-tuning and Group Relative Policy Optimization to jointly improve logical correctness and language stability. Across thirteen languages, our approach consistently outperforms strong baselines and scales effectively, achieving 85.21% language consistency and 54.35% logical correctness at 7B parameters, and 94.96% language consistency and 70.04% logical correctness at 32B parameters. These results support reliable and equitable multilingual medical reasoning in LLMs. The code and dataset are available at https://cure-med.github.io/
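
As a rough illustration of the group-relative idea behind Group Relative Policy Optimization, the sketch below scores several sampled answers to one prompt and normalizes each reward against the group's mean and standard deviation. The reward function, its weights, and the use of a population standard deviation are assumptions made for this example; the abstract does not specify how correctness and language consistency are combined, so this is not the paper's implementation.

```python
from statistics import mean, pstdev


def combined_reward(is_correct, in_target_language,
                    w_correct=1.0, w_language=0.5):
    """Hypothetical reward: weighted sum of two binary signals.

    The weights are illustrative assumptions, not values from the paper.
    """
    return w_correct * float(is_correct) + w_language * float(in_target_language)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Each advantage is (reward - group mean) / (group std + eps), so answers
    are ranked relative to the other samples for the same prompt.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt:
# (answer correct?, answer written in the query language?)
samples = [(True, True), (True, False), (False, True), (False, False)]
rewards = [combined_reward(c, l) for c, l in samples]
advantages = group_relative_advantages(rewards)
```

Under these assumed weights, an answer that is both correct and in the query language receives the largest advantage, while one that fails both receives the smallest, which is one simple way a single training signal could reflect both objectives.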



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources
cs.CL 2026-04 unverdicted novelty 7.0

A unified survey that consolidates Indian NLP resources by task, language, domain, and modality while identifying gaps in coverage and generalization.