Enhancing Clinical Trial Patient Matching through Knowledge Augmentation and Reasoning with Multi-Agent

Hanwen Shi; Jin Zhang; Kunpeng Zhang

arxiv: 2411.14637 · v5 · pith:ZSFR7FDXnew · submitted 2024-11-22 · 💻 cs.MA

Pith reviewed 2026-05-23 16:58 UTC · model grok-4.3

classification 💻 cs.MA

keywords clinical trial matching · multi-agent systems · knowledge augmentation · structured reasoning · patient matching · privacy-preserving AI

The pith

MAKAR uses multi-agent criterion augmentation and structured reasoning to raise patient-trial matching accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MAKAR as a multi-agent system that augments trial eligibility criteria with extra knowledge and then applies step-by-step reasoning across agents to decide matches. This setup is tested on multiple datasets and produces an average 7 percent gain in matching performance. The same architecture supports running the entire process locally so patient data never leaves the site. It also keeps results competitive when smaller open-source language models replace larger closed ones. If the gains hold, trial recruitment could become both more accurate and more private without requiring the biggest available models.

Core claim

MAKAR integrates criterion augmentation with structured reasoning inside a multi-agent framework to enhance the accuracy of patient-trial matching, achieving an average 7 percent performance improvement across datasets while supporting privacy-preserving deployment and competitive results with smaller open-source models.

What carries the argument

MAKAR, a multi-agent system that augments trial criteria and performs structured reasoning to decide patient matches.
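The two stages named here, criterion augmentation followed by structured reasoning, can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: `LLM`, `MatchResult`, `augment_criterion`, `reason_about_match`, `match_patient`, and both prompt texts are hypothetical names and wording introduced for this example, and the model is represented by any function that maps a prompt string to a completion string.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stand-in for a language model: prompt in, completion out.
LLM = Callable[[str], str]


@dataclass
class MatchResult:
    criterion: str
    augmented_criterion: str
    reasoning: str
    eligible: bool


def augment_criterion(llm: LLM, criterion: str) -> str:
    """Stage 1 (assumed design): expand a terse criterion with background knowledge."""
    prompt = f"Explain the clinical terms in this eligibility criterion:\n{criterion}"
    return f"{criterion}\nAdditional explanation: {llm(prompt)}"


def reason_about_match(llm: LLM, augmented: str, patient_note: str) -> Tuple[str, bool]:
    """Stage 2 (assumed design): step-by-step reasoning that ends in a YES/NO line."""
    prompt = (
        "Decide step by step whether the patient meets the criterion.\n"
        f"Criterion:\n{augmented}\n"
        f"Patient note:\n{patient_note}\n"
        "Finish with a final line containing only YES or NO."
    )
    reasoning = llm(prompt)
    lines = reasoning.strip().splitlines()
    verdict = lines[-1].strip().upper() if lines else ""
    # Anything other than an explicit YES is treated as not eligible.
    return reasoning, verdict == "YES"


def match_patient(llm: LLM, criterion: str, patient_note: str) -> MatchResult:
    """Run augmentation, then reasoning, and keep the trace for inspection."""
    augmented = augment_criterion(llm, criterion)
    reasoning, eligible = reason_about_match(llm, augmented, patient_note)
    return MatchResult(criterion, augmented, reasoning, eligible)
```

Keeping the augmented criterion and the reasoning text in the result is what would make a decision auditable after the fact; how MAKAR itself stores or exposes these traces is not described in the abstract.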

If this is right

Patient-trial matching accuracy rises by an average of 7 percent on the tested datasets.
The system supports privacy-preserving deployment by keeping data local.
Performance remains competitive when smaller open-source models are used in place of larger ones.
The structured reasoning steps make the matching decisions more transparent.
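An "average 7 percent" gain can be read as an average difference in percentage points or as an average relative improvement, and the abstract does not say which is meant. The sketch below computes both from per-dataset scores. The function name `average_gains`, the dataset names, and all numbers are placeholders invented for illustration; none of them come from the paper.

```python
from statistics import mean
from typing import Dict, Tuple


def average_gains(baseline: Dict[str, float], system: Dict[str, float]) -> Tuple[float, float]:
    """Return (mean absolute gain, mean relative gain) across datasets.

    Both dictionaries map a dataset name to a score in [0, 1].
    """
    if set(baseline) != set(system) or not baseline:
        raise ValueError("both score tables must cover the same non-empty set of datasets")
    names = sorted(baseline)
    absolute = mean(system[n] - baseline[n] for n in names)
    relative = mean((system[n] - baseline[n]) / baseline[n] for n in names)
    return absolute, relative


# Placeholder scores, NOT results from the paper.
baseline_scores = {"dataset_a": 0.50, "dataset_b": 0.80}
system_scores = {"dataset_a": 0.55, "dataset_b": 0.84}
abs_gain, rel_gain = average_gains(baseline_scores, system_scores)
print(f"absolute: {abs_gain:.3f}, relative: {rel_gain:.3f}")
```

With these placeholder scores the two readings differ (4.5 points absolute versus 7.5 percent relative), which is why a reported average gain needs its metric and averaging convention stated alongside it.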

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same augmentation-plus-reasoning pattern could be tested on other eligibility or matching tasks inside hospitals.
Hospitals could run controlled trials on their own historical data to check whether the 7 percent gain appears on new populations.
If the gains generalize, trial recruitment teams might reduce the volume of manual chart reviews.

Load-bearing premise

The performance gains come from the multi-agent criterion augmentation and reasoning process rather than from patterns specific to the datasets used in the tests.

What would settle it

Applying MAKAR to a fresh collection of patient records and trial criteria drawn from a different hospital system and observing no accuracy improvement over standard matching methods.
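One way to make "no accuracy improvement" concrete on such a fresh collection is a paired comparison: score the baseline and MAKAR on the same patient-criterion pairs and put a confidence interval on the accuracy difference. The sketch below uses a percentile bootstrap over paired cases. It is an assumption about how the check could be run, not a procedure taken from the paper, and `paired_bootstrap_ci` and its parameters are hypothetical names.

```python
import random
from typing import List, Sequence, Tuple


def paired_bootstrap_ci(
    baseline_correct: Sequence[bool],
    system_correct: Sequence[bool],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (observed accuracy difference, CI lower bound, CI upper bound).

    Each position in the two sequences refers to the same test case, so
    resampling case indices preserves the pairing between the two systems.
    """
    n = len(baseline_correct)
    if n == 0 or n != len(system_correct):
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    # Per-case difference: +1 system right only, -1 baseline right only, 0 tie.
    per_case: List[int] = [int(s) - int(b) for b, s in zip(baseline_correct, system_correct)]
    observed = sum(per_case) / n
    resampled = []
    for _ in range(n_resamples):
        draw = [per_case[rng.randrange(n)] for _ in range(n)]
        resampled.append(sum(draw) / n)
    resampled.sort()
    lower = resampled[int((alpha / 2) * n_resamples)]
    upper = resampled[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return observed, lower, upper
```

If the resulting interval comfortably includes zero on data from a different hospital system, that would count against the claim that the gain generalizes; an interval well above zero would support it.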

Figures

Figures reproduced from arXiv: 2411.14637 by Hanwen Shi, Jin Zhang, Kunpeng Zhang.

**Figure 2.** The workflow of the MAKAR framework.

… provided in Appendix D. Router Agent: In MAKAR, multiple specialized augmentation agents are available to augment the criterion with domain-specific knowledge. The Router Agent determines the most suitable augmentation agent for the task and routes the workflow accordingly. Inspired by SELFREF (Kadavath et al., 2022) and MOREINFO (Feng et al., 2023), we adopt a prompt-based …
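The routing step described in the Figure 2 excerpt can be sketched as a function that shows the model a menu of augmentation agents and asks it to name one. The excerpt is cut off before the paper's actual prompt design, so everything below is an assumption: the function `route`, the prompt wording, the agent names, and the fallback to a default agent are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a language model: prompt in, completion out.
LLM = Callable[[str], str]


def route(llm: LLM, criterion: str, agents: Dict[str, str], default: str) -> str:
    """Return the name of the augmentation agent chosen for this criterion.

    `agents` maps an agent name to a one-line description of its expertise.
    If the model's answer does not match a known agent, `default` is returned.
    """
    menu = "\n".join(f"- {name}: {description}" for name, description in agents.items())
    prompt = (
        "Choose the single best agent to clarify the criterion below.\n"
        f"Agents:\n{menu}\n"
        f"Criterion: {criterion}\n"
        "Answer with the agent name only."
    )
    answer = llm(prompt).strip().lower()
    for name in agents:
        if name.lower() == answer:
            return name
    return default
```

Falling back to a default rather than raising an error is a design choice for this sketch; a real system might instead re-prompt or skip augmentation when the router's answer is unusable.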

Original abstract

Matching patients effectively and efficiently for clinical trials is a significant challenge due to the complexity and variability of patient profiles and trial criteria. This paper introduces Multi-Agent for Knowledge Augmentation and Reasoning (MAKAR), a novel multi-agent system that enhances patient-trial matching by integrating criterion augmentation with structured reasoning. MAKAR consistently improves performance by an average of 7% across different datasets. Furthermore, it enables privacy-preserving deployment and maintains competitive performance when using smaller open-source models. Overall, MAKAR can contributes to more transparent, accurate, and privacy-conscious AI-driven patient matching.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MAKAR applies multi-agent reasoning and criterion augmentation to patient-trial matching, but the 7% gain claim rests on evaluation details that the abstract does not supply.

The paper describes MAKAR, a multi-agent setup that augments trial criteria and runs structured reasoning steps to match patients to clinical trials. It reports an average 7% lift across datasets, plus the ability to use smaller open-source models and keep data local for privacy reasons. That combination of practical constraints is the part worth noting for anyone working on deployment in medical settings.

The work applies known multi-agent patterns and knowledge augmentation rather than introducing a new primitive, so the contribution sits in the domain application. The privacy and small-model angles are useful reminders that real-world use often favors those properties over raw scale.

The main gap is the lack of any concrete evaluation information in the abstract. There is no list of datasets, no baseline systems, no metric definitions, and no mention of statistical checks. Without those, the 7% figure cannot be assessed for robustness or generalizability. The full paper may contain the missing sections, but the current summary leaves the central performance claim unsupported by visible evidence.

This kind of applied paper could interest groups focused on clinical NLP or trial operations who already follow multi-agent LLM work. It would be worth sending to referees only if the methods and results sections supply the standard experimental controls and comparisons; otherwise the manuscript is too thin to justify review time.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces MAKAR, a multi-agent system for clinical trial patient matching that performs criterion augmentation and structured reasoning. It claims this yields a consistent average 7% performance improvement across datasets, enables privacy-preserving deployment, and remains competitive when using smaller open-source models.

Significance. If the empirical results are reproducible and generalizable, the multi-agent architecture could offer a practical advance for AI-assisted trial matching by combining knowledge augmentation with reasoning while supporting privacy constraints. The approach is timely given regulatory interest in transparent and privacy-conscious medical AI, but the current presentation provides no basis to evaluate whether the claimed gains are robust.

major comments (2)

[Abstract] The central claim that MAKAR 'consistently improves performance by an average of 7% across different datasets' is presented without any description of the datasets (size, diversity, train/test splits), baseline systems (single LLM, rule-based, or prior SOTA), metrics (exact-match accuracy, F1, AUC, etc.), or statistical tests. This information is load-bearing for the primary empirical contribution and its absence prevents verification of the result.
[Abstract / Evaluation, wherever reported] No controls for confounding factors, no ablation of the criterion-augmentation versus multi-agent reasoning components, and no discussion of how privacy-preserving deployment is achieved (e.g., agent communication protocols or data locality). These omissions make it impossible to assess whether the reported gains are genuine or dataset-specific prompting artifacts.

minor comments (1)

[Abstract] Grammatical error: 'MAKAR can contributes' should read 'MAKAR can contribute'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below, clarifying where details appear in the full manuscript and indicating revisions to the abstract for improved clarity.

Referee: [Abstract] The central claim that MAKAR 'consistently improves performance by an average of 7% across different datasets' is presented without any description of the datasets (size, diversity, train/test splits), baseline systems (single LLM, rule-based, or prior SOTA), metrics (exact-match accuracy, F1, AUC, etc.), or statistical tests. This information is load-bearing for the primary empirical contribution and its absence prevents verification of the result.

Authors: The abstract is kept concise per typical constraints, but the full manuscript (Evaluation section) provides the requested details: dataset sizes and diversity (multiple clinical trial corpora with patient profiles), train/test splits, baselines (single-LLM prompting, rule-based systems, and prior SOTA), metrics (exact-match accuracy and F1), and statistical significance tests. We will revise the abstract to briefly reference these elements so the central claim is more self-contained.

Revision: yes

Referee: [Abstract / Evaluation, wherever reported] No controls for confounding factors, no ablation of the criterion-augmentation versus multi-agent reasoning components, and no discussion of how privacy-preserving deployment is achieved (e.g., agent communication protocols or data locality). These omissions make it impossible to assess whether the reported gains are genuine or dataset-specific prompting artifacts.

Authors: The full manuscript contains ablation studies isolating criterion augmentation from multi-agent reasoning, controlled experiments addressing potential confounders, and explicit discussion of privacy via local inference of smaller models, secure agent protocols, and data locality (Methods and Architecture sections). We agree the abstract does not summarize these and will add a concise statement on ablations, controls, and privacy mechanisms to strengthen verifiability.

Revision: yes
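The ablation at issue in this exchange, separating criterion augmentation from multi-agent reasoning, amounts to a 2×2 grid over the two components. The sketch below shows one way such a grid could be organized and summarized. It is illustrative only: `run_ablation`, `component_effects`, and the `evaluate` callback are hypothetical, and nothing here reflects how the authors actually ran their ablations.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# (use_augmentation, use_structured_reasoning)
Config = Tuple[bool, bool]


def run_ablation(evaluate: Callable[[bool, bool], float]) -> Dict[Config, float]:
    """Score all four on/off combinations of the two components.

    `evaluate` is a caller-supplied function that runs the matcher with the
    given switches and returns a single score such as accuracy.
    """
    return {(aug, rea): evaluate(aug, rea) for aug, rea in product([False, True], repeat=2)}


def component_effects(scores: Dict[Config, float]) -> Dict[str, float]:
    """Summarize each component's gain over the configuration with both off."""
    base = scores[(False, False)]
    return {
        "augmentation_only": scores[(True, False)] - base,
        "reasoning_only": scores[(False, True)] - base,
        "both": scores[(True, True)] - base,
        # Non-zero when the combined gain differs from the sum of the parts.
        "interaction": scores[(True, True)] - scores[(True, False)] - scores[(False, True)] + base,
    }
```

Reporting the interaction term alongside the individual effects would show whether the two components merely add up or whether the reasoning stage depends on the augmented criteria to be useful.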

Circularity Check

0 steps flagged

No circularity: empirical performance claims with no derivation chain

The paper introduces MAKAR as a multi-agent system and reports an average 7% performance improvement as an empirical observation across datasets. No mathematical derivation, first-principles result, or prediction step is claimed that could reduce to fitted inputs or self-citations by construction. The central claim is a measured outcome on evaluation sets rather than a tautological or self-referential derivation, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no equations, parameters, or technical mechanisms are described, so the ledger is empty.

