
arxiv: 2606.02812 · v1 · pith:T3LR4WPInew · submitted 2026-06-01 · 💻 cs.AI · cs.CL

Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection

Pith reviewed 2026-06-28 14:17 UTC · model grok-4.3

classification 💻 cs.AI cs.CL
keywords multi-agent systems · electronic health records · lung cancer prediction · patient trajectory modeling · reinforcement learning · experience retrieval

The pith

Traj-Evolve combines an experience pool for case retrieval with multi-agent reinforcement learning to improve lung cancer prediction from longitudinal EHR data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Traj-Evolve to address a limitation of existing LLM multi-agent systems: they handle each patient's EHR trajectory in isolation instead of drawing on experience from similar prior cases. It equips the system with an Experience Pool that stores rejection-sampled reasoning traces and retrieves them as few-shot examples, and with multi-agent reinforcement learning that fine-tunes collaboration among agents and between agents and memory. A leave-one-out cross-retrieval step keeps training and inference consistent under this retrieval augmentation. When tested on a lung cancer prediction task using up to five years of multimodal records, the approach outperforms nine baselines on both the overall population and the harder never-smoker subgroup.

Core claim

Traj-Evolve shows that a self-evolving multi-agent system can model sparse longitudinal EHR sequences more effectively than processing each patient in isolation. The system combines an Experience Pool of indexed reasoning traces with multi-agent reinforcement learning via reward-ranked fine-tuning, unified by leave-one-out cross-retrieval, and it produces better lung cancer risk predictions on both the general and never-smoker populations.

What carries the argument

The dual evolution of the Experience Pool (non-parametric retrieval of similar-patient traces) and multi-agent reinforcement learning (parametric optimization of agent-memory collaboration), aligned by leave-one-out cross-retrieval.
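The non-parametric half of this design can be pictured as a nearest-neighbor lookup over stored traces. The following is a minimal sketch under stated assumptions, not the paper's implementation: the `Trace` fields, the cosine-similarity metric, the toy two-dimensional embeddings, and the prompt format are all hypothetical, since the abstract does not say how patients are embedded or how retrieved traces are presented to the agents.

```python
import math
from dataclasses import dataclass


@dataclass
class Trace:
    patient_id: str
    embedding: list[float]  # hypothetical patient representation
    reasoning: str          # stored reasoning trace
    label: int              # verified status: 1 = cancer, 0 = benign


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(pool: list[Trace], query: list[float], k: int) -> list[Trace]:
    """Return the k pool entries most similar to the query embedding."""
    ranked = sorted(pool, key=lambda t: cosine(t.embedding, query), reverse=True)
    return ranked[:k]


def few_shot_block(examples: list[Trace]) -> str:
    """Format retrieved traces as few-shot context for an agent prompt."""
    return "\n\n".join(
        f"Similar patient {t.patient_id} (verified label {t.label}):\n{t.reasoning}"
        for t in examples
    )


# Toy pool with invented entries, for illustration only.
pool = [
    Trace("p1", [1.0, 0.0], "Stable nodule, no growth on follow-up.", 0),
    Trace("p2", [0.9, 0.1], "Persistent cough, enlarging nodule.", 1),
    Trace("p3", [0.0, 1.0], "No relevant findings.", 0),
]
neighbors = retrieve(pool, query=[1.0, 0.0], k=2)
print(few_shot_block(neighbors))
```

A real system would likely replace the linear scan with an approximate nearest-neighbor index once the pool grows, but the retrieval contract stays the same: embedding in, k verified traces out.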

If this is right

  • Expanding the Experience Pool shifts optimal retrieval toward more specific rather than diverse samples.
  • Under the reinforcement learning step, the manager agent's prediction loss converges quickly while worker agents continue to gain from additional verified patients.
  • The Experience Pool raises specificity of risk predictions while the reinforcement learning step raises sensitivity, and the two effects add constructively.
  • The performance advantage holds on the challenging never-smoker subpopulation as well as the overall population.
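To make the specificity and sensitivity claim concrete, the sketch below computes both metrics at a fixed threshold. The labels and risk scores are invented toy numbers chosen only to show how the metrics respond when negatives are pushed down or positives are pushed up; they are not results from the paper, and the 0.5 threshold is an assumption.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels and risk scores."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Invented toy data: three cases (label 1) and three controls (label 0).
labels = [1, 1, 1, 0, 0, 0]
baseline = [0.8, 0.4, 0.3, 0.6, 0.2, 0.1]
lower_negatives = [0.8, 0.4, 0.3, 0.4, 0.2, 0.1]   # one control score pushed down
raise_positives = [0.8, 0.6, 0.55, 0.6, 0.2, 0.1]  # two case scores pushed up
both = [0.8, 0.6, 0.55, 0.4, 0.2, 0.1]             # both shifts applied

for name, scores in [
    ("baseline", baseline),
    ("lower_negatives", lower_negatives),
    ("raise_positives", raise_positives),
    ("both", both),
]:
    print(name, sensitivity_specificity(labels, scores))
```

Lowering scores for controls can only help specificity and raising scores for cases can only help sensitivity, which is why two mechanisms acting on different sides of the threshold can combine without cancelling.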

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pair of non-parametric retrieval and parametric collaboration tuning might be tested on trajectory tasks for other chronic conditions that produce long EHR sequences.
  • The observed split between quick manager convergence and ongoing worker improvement suggests experiments that vary the number of verified patients supplied to each role separately.
  • If the complementarity between pool expansion and reinforcement learning persists, hybrid memory-plus-learning designs could be examined in non-medical multi-agent settings that also face long sparse contexts.

Load-bearing premise

The leave-one-out cross-retrieval strategy successfully aligns training-time and inference-time behavior under retrieval augmentation.
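One plausible reading of this premise is that, during training, a patient must never retrieve their own stored trace, so that training-time prompts resemble inference-time prompts for unseen patients. The sketch below illustrates that exclusion rule only. The function name, the `(patient_id, embedding)` pool layout, and the squared Euclidean distance are assumptions made for illustration, not details taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_leave_one_out(pool, query_id, query_embedding, k):
    """Return the IDs of the k nearest pool entries, excluding the query patient.

    pool is a list of (patient_id, embedding) pairs. Any entry whose
    patient_id equals query_id is dropped before ranking.
    """
    candidates = [(pid, emb) for pid, emb in pool if pid != query_id]
    candidates.sort(key=lambda item: squared_distance(item[1], query_embedding))
    return [pid for pid, _ in candidates[:k]]


# Toy pool with invented embeddings.
loo_pool = [("p1", [0.0, 0.0]), ("p2", [0.1, 0.0]), ("p3", [1.0, 1.0])]

# Training time: p1 is in the pool, so it is held out of its own retrieval.
train_neighbors = retrieve_leave_one_out(loo_pool, "p1", [0.0, 0.0], k=2)

# Inference time: a new patient is not in the pool, so nothing is excluded.
test_neighbors = retrieve_leave_one_out(loo_pool, "new_patient", [0.0, 0.0], k=2)

print(train_neighbors, test_neighbors)
```

Without the exclusion, the training query for `p1` would return `p1` itself at distance zero, a shortcut that never exists for a genuinely new patient.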

What would settle it

Removing the leave-one-out cross-retrieval step and checking whether the reported gains over the nine baselines disappear on the lung cancer prediction task using five-year multimodal EHRs.

Figures

Figures reproduced from arXiv: 2606.02812 by Matthew Thompson, Meliha Yetisgen, Ruth Etzioni, Sihang Zeng.

Figure 1: Overview of the Traj-Evolve architecture and self-evolving workflow. The top panel illustrates the self-evolving process, wherein the system accumulates experience from prior verified patients to iteratively update Traj-Evolve and facilitate prediction for a new patient. The bottom panel details the pipeline.

Figure 2 (A): ExPool equips Traj-Evolve with a non-parametric procedural memory that saves certain reasoning traces of verified patients. As Traj-Evolve generates predictions that can be subsequently verified against ground-truth diagnostic status (e.g., confirmed cancer diagnosis or benign status), these verified reasoning traces can serve as experience for future cases. This design is analogous to expert clin…

Figure 4: Evolution of the ExPool. As ExPool size increases: A, average distance to retrieved neighbors decreases; B, index-neighbor Spearman correlations (age, risk score) increase; and C, retrieval purity (case status, sex, never-smoker) improves. Dashed lines in C indicate random retrieval baselines. D, AUROC trajectories for k ∈ {5, 10, 15} retrieved patients (mean of 3 seeds). The dashed line denotes the base…

Figure 6: Optimization properties of Traj-Evolve’s self-evolving mechanisms. Density scattered plots comparing the predicted risk scores of Traj-Evolve variants against the static Traj-CoA baseline. Arrows illustrate the strength of how Traj-Evolve changes the scores over Traj-CoA (wider means stronger). Scattered points are presented in a jittered way to facilitate visualization.

Figure 5: Evolving MARL performance during agent training. A, Loss curves for the worker and manager agents across training iterations. B, Model performance (mean of 3 seeds) as the number of training samples increases.

Figure 7: Case study demonstrating Traj-Evolve (ExPool+MARL)’s reasoning for a never-smoker patient. A UMAP…
Original abstract

Modeling patient trajectories from longitudinal electronic health records (EHRs) requires reasoning over sparse, noisy, and long-context multimodal sequences. Existing LLM-based multi-agent systems address context length but process patients in isolation, failing to mirror how clinicians leverage accumulated experience from similar prior cases. We present Traj-Evolve, a self-evolving multi-agent system with two complementary evolving mechanisms. First, an Experience Pool (ExPool) acts as a non-parametric memory, indexing rejection-sampled reasoning traces to retrieve similar patients as few-shot contexts. Second, multi-agent reinforcement learning (MARL) via reward-ranked fine-tuning parametrically optimizes inter-agent and agent-memory collaboration. A leave-one-out cross-retrieval strategy unifies the two, aligning training- and inference-time behavior under retrieval augmentation. On a lung cancer prediction task utilizing up to five years of multimodal EHRs, Traj-Evolve outperforms 9 strong baselines on the overall population and a challenging never-smoker population. Analysis of the evolving dynamics highlights three key findings: (1) expanding the ExPool shifts optimal retrieval from diverse to specific samples; (2) under MARL, the manager agent's prediction loss converges quickly while the worker agents' temporal reasoning continues to benefit from more verified patients; and (3) the two mechanisms are complementary on the predicted risk, where ExPool improves specificity while MARL improves sensitivity.
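The phrase "reward-ranked fine-tuning" over "rejection-sampled reasoning traces" can be read as: sample several candidate traces per patient, score each against the verified outcome, discard poor ones, and keep the best survivor as training data. The sketch below follows that reading. The reward function, the `min_reward` cutoff, and the `Candidate` fields are hypothetical, because the abstract does not define the reward or the acceptance rule.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    patient_id: str
    trace: str
    predicted_risk: float  # risk score in [0, 1] produced with this trace


def reward(candidate: Candidate, label: int) -> float:
    """Hypothetical reward: closeness of the predicted risk to the verified label."""
    return 1.0 - abs(label - candidate.predicted_risk)


def select_for_finetuning(candidates, labels, min_reward=0.5):
    """Keep, per patient, the highest-reward candidate that clears min_reward."""
    best = {}
    for cand in candidates:
        r = reward(cand, labels[cand.patient_id])
        if r < min_reward:
            continue  # rejection step: discard low-reward traces
        current = best.get(cand.patient_id)
        if current is None or r > current[0]:
            best[cand.patient_id] = (r, cand)  # ranking step: keep the best so far
    return [cand for _, cand in best.values()]


# Invented toy candidates: two sampled traces for each of two patients.
verified = {"p1": 1, "p2": 0}
sampled = [
    Candidate("p1", "trace a", 0.9),
    Candidate("p1", "trace b", 0.6),
    Candidate("p2", "trace c", 0.7),
    Candidate("p2", "trace d", 0.8),
]
selected = select_for_finetuning(sampled, verified)
print([c.trace for c in selected])
```

In this toy run, patient `p2` contributes nothing because every sampled trace scored a benign patient as high risk. Under this reading, the number of verified patients matters more than the number of samples per patient.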

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces Traj-Evolve, a self-evolving multi-agent LLM system for modeling patient trajectories from up to five years of multimodal longitudinal EHRs in lung cancer early detection. It combines a non-parametric Experience Pool (ExPool) that indexes rejection-sampled reasoning traces for few-shot retrieval of similar patients with multi-agent reinforcement learning (MARL) via reward-ranked fine-tuning to optimize inter-agent and agent-memory collaboration. These are unified by a leave-one-out cross-retrieval strategy claimed to align training-time and inference-time retrieval augmentation. The central empirical claim is outperformance over nine strong baselines on a lung cancer prediction task, for both the overall population and the never-smoker subgroup, with additional analysis of evolving dynamics showing complementarity (ExPool for specificity, MARL for sensitivity).

Significance. If the outperformance claim holds after verification that the leave-one-out strategy prevents leakage, the work would be significant as one of the first demonstrations of complementary non-parametric retrieval and parametric MARL evolution in a clinical multi-agent setting. The reported gains on the never-smoker cohort and the three findings on pool scaling, convergence, and risk complementarity could inform design of retrieval-augmented clinical LLMs.

major comments (1)
  1. [Abstract] Abstract (and §3.3 if present): the leave-one-out cross-retrieval strategy is presented as the mechanism that unifies ExPool and MARL so that training-time retrieval augmentation matches inference-time behavior, yet no details are supplied on pool construction timing, exclusion scope, or verification that no patient-level overlap or temporal leakage occurs across the five-year EHR windows. This alignment is load-bearing for the outperformance claim on both cohorts; without it the reported gains could arise from mismatched augmentation rather than the claimed complementarity.
minor comments (2)
  1. [Abstract] Abstract: quantitative results (AUC, sensitivity/specificity deltas, dataset size, exclusion criteria, baseline names, error bars) are absent, making it impossible to assess the magnitude or statistical significance of the claimed outperformance from the summary alone.
  2. The manuscript would benefit from an explicit statement of how the rejection sampling for ExPool traces is performed and whether the same reward model is used for both ExPool indexing and MARL ranking.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for this detailed and constructive comment on the leave-one-out strategy. We address it directly below and will revise the manuscript to strengthen the presentation.

  1. Referee: [Abstract] Abstract (and §3.3 if present): the leave-one-out cross-retrieval strategy is presented as the mechanism that unifies ExPool and MARL so that training-time retrieval augmentation matches inference-time behavior, yet no details are supplied on pool construction timing, exclusion scope, or verification that no patient-level overlap or temporal leakage occurs across the five-year EHR windows. This alignment is load-bearing for the outperformance claim on both cohorts; without it the reported gains could arise from mismatched augmentation rather than the claimed complementarity.

    Authors: We agree that the current manuscript provides insufficient implementation details on the leave-one-out cross-retrieval strategy, and that this is a substantive concern given its role in aligning training and inference. In the revised manuscript we will expand §3.3 (and the abstract) to explicitly describe: (i) pool construction timing (ExPool is built once on the training set before any MARL fine-tuning begins); (ii) exclusion scope (for every training example the current patient’s entire five-year record is removed from the retrieval pool, with patient IDs used to enforce this); and (iii) verification steps (patient-ID uniqueness checks across all splits plus manual inspection confirming no temporal overlap between a patient’s training windows and any retrieved examples). These additions will directly substantiate the no-leakage claim and thereby support the reported complementarity and performance gains. revision: yes
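The verification steps promised in the rebuttal (patient-ID uniqueness across splits, and no patient retrieving their own record) are straightforward to automate. The sketch below shows one way such checks might look. The split names, the shape of the retrieval log, and both function names are assumptions for illustration, and the check does not cover the temporal-overlap inspection, which the authors describe as manual.

```python
from itertools import combinations


def split_overlaps(splits):
    """Return {(split_a, split_b): shared_ids} for every pair of splits that share patients."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(splits.items(), 2):
        shared = set(ids_a) & set(ids_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


def self_retrievals(retrieval_log):
    """Return query IDs that appear among their own retrieved examples.

    retrieval_log is a list of (query_id, retrieved_ids) pairs.
    """
    return [query for query, retrieved in retrieval_log if query in retrieved]


# Invented toy data with one deliberate leak of each kind.
splits = {
    "train": ["p1", "p2", "p3"],
    "val": ["p4"],
    "test": ["p5", "p2"],  # p2 also appears in train
}
retrieval_log = [
    ("p1", ["p2", "p3"]),
    ("p3", ["p3", "p1"]),  # p3 retrieved its own trace
]

print(split_overlaps(splits))
print(self_retrievals(retrieval_log))
```

Both functions return empty results on a clean dataset, so they can be used as assertions in a data-preparation pipeline.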

Circularity Check

0 steps flagged

No circularity: distinct mechanisms with empirical evaluation


The paper describes two separate mechanisms—non-parametric ExPool retrieval of rejection-sampled traces and parametric MARL via reward-ranked fine-tuning—unified only by a leave-one-out cross-retrieval alignment strategy for training/inference consistency. No equations, fitted parameters, or derivations are shown that reduce the claimed outperformance (on overall and never-smoker cohorts) to a quantity defined by the same inputs or by self-citation. The central claim remains an empirical comparison against 9 baselines on multimodal EHR data, with the alignment strategy presented as an implementation detail rather than a self-defining fit. This is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no free parameters, axioms, or invented entities can be extracted or audited.


