arxiv: 2604.23802 · v1 · submitted 2026-04-26 · 💻 cs.MA


EndoGov: A knowledge-governed multi-agent expert system for endometrial cancer risk stratification

Weiye Dai , Liyun Shi , Zanxiang He , Yuling Ma , Mengyuan Lin , Dianxiang Sun , Liming Nie


Pith reviewed 2026-05-08 04:56 UTC · model grok-4.3

classification 💻 cs.MA

keywords: endometrial cancer, risk stratification, multi-agent system, clinical guidelines, knowledge graph, expert system, governance agent, POLE mutation


The pith

EndoGov separates evidence extraction by specialist agents from rule application by a governance agent to enforce clinical guidelines in endometrial cancer risk assignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a two-tier multi-agent system that first lets pathology, molecular, and clinical agents pull structured evidence from images or records, then routes that evidence to a governance agent. The governance agent consults an executable rule set drawn from guidelines and applies hard overrides for priority cases such as POLE-mutated tumors. This factorization produces risk labels that remain compliant with mandatory rules while matching or exceeding the discrimination of standard neural models on two separate cohorts. A sympathetic reader would care because current multimodal models often optimize raw accuracy at the expense of violating explicit clinical mandates, leaving decisions that are hard to audit or trust. The approach therefore offers a concrete route to auditable, guideline-respecting automation without sacrificing performance.

Core claim

The decision process is factorized as D(x) = G(P(x), R), where specialist agents P generate schema-constrained evidence reports and the governance agent G applies an executable rule set R drawn from a guideline knowledge graph using deterministic hard paths for overrides and constrained soft-path reasoning for ambiguous cases. On the TCGA-UCEC cohort this yields 0.943 accuracy and 0.973 macro AUC with a conditional logic-violation rate of 0.93 percent among trigger-exposed cases; on the CPTAC-UCEC cohort, where labels are themselves guideline-derived, accuracy reaches 0.842 while locked-transfer neural baselines fall below 0.31. Residual failures localize to upstream evidence extraction, and

What carries the argument

The governance agent G that applies an executable rule set R from the Guideline Knowledge Graph to evidence produced by specialist agents P.
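
The factorization D(x) = G(P(x), R) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Evidence` fields, the `HardRule` structure, and the toy `soft_path` scorer are hypothetical stand-ins, not the paper's schema, knowledge graph, or LLM chair. Only the rule name `R1_POLE` comes from the source (Figure 4 caption).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


# Hypothetical evidence schema produced by the specialist agents P.
@dataclass(frozen=True)
class Evidence:
    pole_mutated: bool
    high_grade: bool
    deep_invasion: bool


# Hypothetical representation of one hard-path rule in R.
@dataclass(frozen=True)
class HardRule:
    rule_id: str
    priority: int                      # lower number = higher priority
    trigger: Callable[[Evidence], bool]
    label: str


# Toy rule set R: a single POLE override standing in for the Guideline-KG.
RULES: List[HardRule] = [
    HardRule("R1_POLE", 0, lambda e: e.pole_mutated, "low"),
]


def soft_path(e: Evidence) -> str:
    """Placeholder for constrained soft-path reasoning on ambiguous cases."""
    score = int(e.high_grade) + int(e.deep_invasion)
    return ["low", "intermediate", "high"][score]


def govern(e: Evidence,
           rules: List[HardRule] = RULES,
           soft: Callable[[Evidence], str] = soft_path) -> Tuple[str, Optional[str]]:
    """G(P(x), R): return (risk label, id of the hard rule that fired, if any)."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.trigger(e):
            return rule.label, rule.rule_id   # hard path: deterministic override
    return soft(e), None                      # soft path: no override triggered
```

In this sketch, `govern(Evidence(pole_mutated=True, high_grade=True, deep_invasion=True))` returns `('low', 'R1_POLE')`: the override fires before the soft path is consulted, and the returned rule id is the kind of trace an audit log could record.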

If this is right

Mandatory guideline overrides such as the POLE low-risk assignment remain enforced regardless of conflicting morphologic features.
Safety failures can be isolated to the evidence-extraction tier rather than the rule-application tier.
Hard-path compliance is preserved when the underlying language-model backend is replaced.
Performance advantage over standard models is maintained under distribution shift when reference labels follow the same guidelines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same separation of extraction and governance could be ported to other cancer types whose guidelines contain clear override rules.
Audit logs produced by the governance layer supply the traceability required for regulatory review of medical AI.
Extending the knowledge graph with new guidelines would allow incremental updates without retraining the entire system.

Load-bearing premise

The executable rule set derived from guidelines is complete, conflict-free, and correctly encodes all mandatory overrides while the specialist agents reliably extract every evidence field those rules require.

What would settle it

A single case in which a POLE-mutated high-grade tumor is not assigned to the low-risk group by the governance agent would demonstrate failure of the hard-path override mechanism.
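
A falsification check of this kind could be sketched as an exhaustive search for counterexamples. The code below is a toy under stated assumptions: the evidence keys, the `govern` function, and the two deliberately disagreeing soft-path backends are hypothetical and do not reproduce the paper's system. It only probes the rule-application tier; a POLE mutation that the upstream molecular agent fails to detect would not be caught by this check.

```python
from itertools import product


def govern(evidence: dict, soft_backend) -> str:
    """Toy governance step: the POLE override fires before any backend is consulted."""
    if evidence["pole_mutated"]:
        return "low"
    return soft_backend(evidence)


# Two stand-in soft-path backends that disagree with each other on purpose.
def backend_a(evidence: dict) -> str:
    return "high" if evidence["high_grade"] else "low"


def backend_b(evidence: dict) -> str:
    return "high"


def find_counterexamples() -> list:
    """Return every POLE-mutated case that is not labelled low under some backend."""
    failures = []
    for grade, lvsi, backend in product([False, True], [False, True],
                                        [backend_a, backend_b]):
        ev = {"pole_mutated": True, "high_grade": grade, "lvsi": lvsi}
        if govern(ev, backend) != "low":
            failures.append((ev, backend.__name__))
    return failures
```

An empty result from `find_counterexamples()` is what the hard-path claim predicts for this toy; a single non-empty entry would be the settling counterexample described above.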

Figures

Figures reproduced from arXiv: 2604.23802 by Dianxiang Sun, Liming Nie, Liyun Shi, Mengyuan Lin, Weiye Dai, Yuling Ma, Zanxiang He.

**Figure 1.** Overview of EndoGov (two horizontal bands). Upper: runtime pipeline from multimodal inputs (pathology WSI, molecular omics, structured clinical records) to three Tier 1 specialist agents (UNI+prototype pathology matching, scGPT+molecular report, and FIGO-guided clinical summarization), then the Tier 2 chair agent, which uses the GuidelineKG and routes each case through hard-path priority arbitration or so…

**Figure 2.** Runtime contract over the Guideline-KG. Input (left): structured patient evidence X is extracted from specialist reports (R_path, R_mol, R_cli) into clinically typed variables, including molecular subtype, FIGO stage, histology, grade, myometrial invasion, and modifier flags such as LVSI, deep myometrial invasion, and no-myometrial-invasion status. Graph query (center): the offline governance memory is querie…

**Figure 3.** Ablation performance drop on aligned TCGA-UCEC. Bars show Macro-F1 for the full EndoGov pipeline and component-removal controls; the dashed vertical line marks the full-model reference (Macro-F1 = 0.923). Replacing the LLM chair with a logistic-regression soft-path resolver produces a moderate drop, removing the soft-path resolver or semantic KG context produces larger degradation, and replacing in-loop go…

**Figure 4.** Guideline-grounded reasoning chain for an atypical POLE-mutated case (TCGA-AP-A051). Left: a conventional fusion baseline treats high-grade serous morphology as dominant evidence and predicts High risk, missing the POLE override. Right: EndoGov parses the conflicting evidence, retrieves the source-linked R1_POLE clause, applies priority arbitration rather than averaging evidence streams, and validates the …

Original abstract

Multimodal artificial intelligence models for endometrial cancer (EC) risk stratification typically optimize aggregate predictive performance but provide limited mechanisms for enforcing mandatory guideline overrides, such as assigning POLE-mutated tumors to the low-risk group despite high-grade morphology. We present EndoGov, a two-tier multi-agent expert system that factorizes the decision process as D(x) = G(P(x), R), where specialist agents P extract structured evidence and a governance agent G applies an executable rule set R. Tier 1 comprises pathology, molecular, and clinical agents that independently generate schema-constrained reports from frozen foundation-model features or structured records. Tier 2 queries an evidence-level-weighted Guideline Knowledge Graph, using deterministic hard-path rules for high-priority overrides and constrained soft-path reasoning for ambiguous cases. In TCGA-UCEC (n=541), EndoGov achieved 0.943 accuracy, 0.973 macro AUC, and a conditional logic-violation rate (C-LVR) of 0.93% among trigger-exposed cases. In CPTAC-UCEC (n=95), where reference labels are guideline-derived, EndoGov reached 0.842 accuracy compared with < 0.31 for locked-transfer neural baselines, supporting governance-pathway transfer under distribution shift rather than validation against independent clinical truth. End-to-end safety decomposition localized residual failures primarily to upstream molecular detection rather than downstream governance. Backend-swap experiments further showed that hard-path compliance is invariant to the LLM backend. These findings indicate that explicit clinical-rule governance can provide guideline-compliant, auditable EC risk assignment while preserving competitive discrimination.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EndoGov gives a clean two-tier split between evidence agents and a governance rule layer for endometrial cancer risk, with decent TCGA numbers and better CPTAC transfer than baselines, but the rule set's actual match to guidelines is unverified.


The paper's main move is to factor the risk stratification as D(x) = G(P(x), R), where P are specialist agents pulling structured evidence from pathology, molecular, and clinical data, and G applies an executable rule set from a guideline knowledge graph with hard and soft paths. This is a direct way to build in mandatory overrides like POLE-mutated low-risk assignments without relying on the model to learn them implicitly.

Referee Report

2 major / 2 minor

Summary. The paper presents EndoGov, a two-tier multi-agent expert system for endometrial cancer risk stratification that factorizes the decision as D(x) = G(P(x), R), where Tier-1 specialist agents P extract structured evidence from multimodal inputs and a Tier-2 governance agent G applies an executable rule set R derived from clinical guidelines via a Guideline Knowledge Graph. It reports 0.943 accuracy, 0.973 macro AUC, and 0.93% conditional logic-violation rate (C-LVR) on TCGA-UCEC (n=541), 0.842 accuracy on CPTAC-UCEC (n=95), with failures localized to upstream molecular detection and hard-path compliance invariant to LLM backend.

Significance. If the rule set R is shown to be a complete and faithful encoding of the source guidelines (including mandatory overrides such as POLE-mutated tumors), and the specialist agents reliably extract the required evidence fields, the approach would demonstrate that explicit clinical-rule governance can deliver auditable, guideline-compliant risk assignments with competitive discrimination and robustness under distribution shift. The safety decomposition, backend-swap invariance, and outperformance versus locked-transfer neural baselines on CPTAC are concrete strengths supporting the central claim.

major comments (2)

[Abstract] Abstract and methods (rule-set construction): The central claim that EndoGov provides 'guideline-compliant' EC risk assignment rests on R being a complete, conflict-free encoding of the referenced clinical guidelines (including all mandatory overrides such as POLE-mutated tumors to low-risk). No independent mapping, expert audit, or completeness check of R against the source guidelines is described; the reported metrics only confirm that G follows the implemented R, not that R matches the guidelines. This is load-bearing for the 'guideline-compliant' and 'auditable' assertions.
[Abstract] Abstract (Tier-1 agents): The factorization D(x) = G(P(x), R) and the claim of reliable evidence extraction assume that the pathology, molecular, and clinical agents P extract exactly the schema-constrained fields required by R. No quantitative evaluation of extraction accuracy or error propagation from P to G is provided beyond the overall accuracy and C-LVR; residual failures are localized to 'upstream molecular detection' but without per-agent metrics this remains unverified.

minor comments (2)

The definition and computation of the conditional logic-violation rate (C-LVR) among trigger-exposed cases should be formalized with an equation or pseudocode to allow independent reproduction.
The manuscript would benefit from an explicit table or figure showing the structure of the Guideline Knowledge Graph and the hard-path versus soft-path rules.
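
One possible formalization of C-LVR, offered only as a reading of the phrase "among trigger-exposed cases" and not as the paper's definition: divide the number of trigger-exposed cases whose final label contradicts the label mandated by the triggered hard rule by the total number of trigger-exposed cases. The input format and function name below are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def conditional_lvr(cases: Iterable[Tuple[Optional[str], str]]) -> Optional[float]:
    """Conditional logic-violation rate under the assumed definition.

    Each case is (mandated_label, final_label). mandated_label is the label
    required by a triggered hard rule, or None if no hard rule was triggered.
    Returns None when no case is trigger-exposed, so the rate is undefined.
    """
    exposed = [(m, f) for m, f in cases if m is not None]
    if not exposed:
        return None
    violations = sum(1 for m, f in exposed if f != m)
    return violations / len(exposed)


# Toy usage: four trigger-exposed cases, one of which violates its rule.
example = [("low", "low"), ("low", "high"), (None, "high"),
           ("low", "low"), ("high", "high")]
rate = conditional_lvr(example)
```

For the toy list above, `rate` is 0.25. Whether trigger exposure is judged on ground-truth evidence or on extracted evidence changes what the metric measures, which is part of why the referee asks for an explicit definition.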

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, offering clarifications and committing to revisions that strengthen the presentation without overstating what was performed.


Referee: [Abstract] Abstract and methods (rule-set construction): The central claim that EndoGov provides 'guideline-compliant' EC risk assignment rests on R being a complete, conflict-free encoding of the referenced clinical guidelines (including all mandatory overrides such as POLE-mutated tumors to low-risk). No independent mapping, expert audit, or completeness check of R against the source guidelines is described; the reported metrics only confirm that G follows the implemented R, not that R matches the guidelines. This is load-bearing for the 'guideline-compliant' and 'auditable' assertions.

Authors: We agree that the 'guideline-compliant' and 'auditable' claims require evidence that R faithfully encodes the source guidelines. R was derived by clinical collaborators translating explicit decision criteria and mandatory overrides (including POLE-mutated tumors to low-risk) from NCCN and ESMO guidelines into the executable rules of the Guideline Knowledge Graph. The reported C-LVR of 0.93% verifies that G strictly follows the implemented R. We acknowledge, however, that the manuscript does not include an independent expert audit or formal completeness mapping against the full guideline documents. In the revised version we will add a supplementary table that lists each key guideline statement alongside its corresponding rule in R, together with a methods paragraph describing the construction process. This provides the requested traceability while accurately reflecting that a blinded external audit was not conducted. revision: partial
Referee: [Abstract] Abstract (Tier-1 agents): The factorization D(x) = G(P(x), R) and the claim of reliable evidence extraction assume that the pathology, molecular, and clinical agents P extract exactly the schema-constrained fields required by R. No quantitative evaluation of extraction accuracy or error propagation from P to G is provided beyond the overall accuracy and C-LVR; residual failures are localized to 'upstream molecular detection' but without per-agent metrics this remains unverified.

Authors: We concur that per-agent extraction metrics would strengthen the factorization claim and clarify error propagation. The manuscript already localizes residual failures to upstream molecular detection via the safety decomposition and shows that hard-path compliance remains high even when upstream extraction is imperfect. To address the gap, the revision will include quantitative accuracy figures for each Tier-1 agent (pathology, molecular, clinical) on cases with available ground-truth annotations, plus an explicit error-propagation analysis showing how extraction mistakes affect final risk assignments. These additions will be placed in the methods and results sections. revision: yes
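
The kind of error-propagation analysis promised here could be sketched as a per-case attribution rule. This is a hypothetical illustration, not the paper's safety decomposition: it assumes ground-truth evidence is available, treats the label that the rule layer assigns to ground-truth evidence as the reference, and uses a toy `govern` function with invented evidence keys.

```python
def govern(evidence: dict) -> str:
    """Toy rule layer: POLE override first, then a crude grade-based fallback."""
    if evidence["pole_mutated"]:
        return "low"
    return "high" if evidence["high_grade"] else "low"


def attribute(true_ev: dict, extracted_ev: dict, final_label: str) -> str:
    """Classify one case as 'ok', 'upstream', or 'downstream'.

    'upstream': the rule layer behaved correctly on the evidence it received,
    but that evidence differed from ground truth (an extraction error).
    'downstream': the final label disagrees with what the rule layer should
    have produced from the evidence it received.
    """
    if final_label == govern(true_ev):
        return "ok"
    if extracted_ev != true_ev and final_label == govern(extracted_ev):
        return "upstream"
    return "downstream"


# Toy usage: a POLE mutation missed by the molecular agent.
truth = {"pole_mutated": True, "high_grade": True}
missed = {"pole_mutated": False, "high_grade": True}
outcome = attribute(truth, missed, "high")
```

Here `outcome` is `"upstream"`, because the rule layer applied its fallback correctly to evidence that had lost the POLE flag. Tallying these categories per Tier-1 agent would give the per-agent view the referee requests.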

Circularity Check

0 steps flagged

No significant circularity: rule set R is externally derived from guidelines and performance metrics are independent empirical measurements


The paper's derivation chain centers on the explicit factorization D(x) = G(P(x), R), where R is an executable rule set drawn from clinical guidelines (including mandatory overrides such as POLE-mutated tumors) and P consists of specialist agents that extract schema-constrained evidence. This structure treats R as an independent, auditable input rather than a parameter fitted to the target labels or performance data. The reported metrics (0.943 accuracy and 0.973 macro AUC on TCGA-UCEC; 0.842 accuracy on CPTAC-UCEC) are presented as measurements of the system's behavior under this governance, not as the definitional success criterion. The conditional logic-violation rate (C-LVR) quantifies residual implementation deviations (localized to upstream extraction) rather than redefining compliance by construction. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked as load-bearing justifications. The architecture therefore remains self-contained against external benchmarks, with guideline grounding supplying the non-circular foundation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The system adds a governance mechanism on top of standard foundation-model feature extraction and clinical guidelines; it does not introduce new physical entities or free parameters but relies on the assumption that the guideline rules can be faithfully encoded as executable logic.

axioms (2)

domain assumption: The clinical guidelines contain a complete, non-conflicting set of mandatory overrides that can be expressed as deterministic hard-path rules and constrained soft-path reasoning.
The governance agent G applies an executable rule set R drawn from guidelines; correctness of risk assignment depends on this encoding being faithful and exhaustive.
domain assumption: Specialist agents P can reliably extract the exact schema-constrained evidence fields required by the rule set from frozen foundation-model features or structured records.
Tier 1 performance directly determines whether the governance layer receives the inputs it needs; any systematic extraction error propagates to final risk labels.



Reference graph

Works this paper leans on

62 extracted references · 23 canonical work pages · 2 internal anchors

[1]

MRI-based radiomic model for preoperative risk stratification in stage I endometrial cancer

Chen, J., Gu, H., Fan, W., Wang, Y., Chen, S., Chen, X., Wang, Z., 2021a. MRI-based radiomic model for preoperative risk stratification in stage I endometrial cancer. J. Cancer 12, 726–734. doi:10.7150/jca.50872

work page doi:10.7150/jca.50872
[2]

Multimodal co-attention transformer for survival prediction in gigapixel whole slide images, in: ICCV, pp

Chen, R., Lu, M., et al., 2021b. Multimodal co-attention transformer for survival prediction in gigapixel whole slide images, in: ICCV, pp. 4015–4025
[3]

Pan-cancer integrative histology-genomic analysis via multimodal deep learning

Chen, R., Lu, M., et al., 2022a. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878
[4]

Scalingvisiontransformerstogigapixelimagesviahierarchicalself-supervisedlearning,in:CVPR,pp.16144–16155

Chen,R.,etal.,2022b. Scalingvisiontransformerstogigapixelimagesviahierarchicalself-supervisedlearning,in:CVPR,pp.16144–16155
[5]

A general-purpose self-supervised model for computational pathology, in: CVPR, pp

Chen, R., et al., 2024. A general-purpose self-supervised model for computational pathology, in: CVPR, pp. 1–10

2024
[6]

Prognostic risk modeling of endometrial cancer using programmed cell death-related genes: a comprehensive machine learning approach

Chen, T., Yang, Y., Huang, Z., Pan, F., Xiao, Z., Gong, K., Huang, W., Xu, L., Liu, X., Fang, C., 2025. Prognostic risk modeling of endometrial cancer using programmed cell death-related genes: a comprehensive machine learning approach. Discover Oncology 16, 280. doi:10.1007/s12672-025-02039-8

work page doi:10.1007/s12672-025-02039-8 2025
[7]

Chen, Y., Zhao, W., Yu, L., 2023. Transformer-based multimodal fusion for survival prediction by integrating whole slide images, clinical, and genomic data, in: 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), pp. 1–5. doi:10.1109/ISBI53787.2023. 10230804

work page doi:10.1109/isbi53787.2023 2023
[8]

Prognostic significance of POLE proofreading mutations in endometrial cancer

Church, D.N., Stelloo, E., Nout, R.A., et al., 2015. Prognostic significance of POLE proofreading mutations in endometrial cancer. Journal of the National Cancer Institute 107, dju402

2015
[9]

ESGO/ESTRO/ESP guidelines for the management of patients with endometrial carcinoma

Concin, N., Matias-Guiu, X., Vergote, I., Cibula, D., Mirza, M., Marnitz, S., Ledermann, J., Bosse, T., et al., 2021. ESGO/ESTRO/ESP guidelines for the management of patients with endometrial carcinoma. Int. J. Gynecol. Cancer 31, 12–39

2021
[10]

Immunologic signatures across molecular subtypes and potential biomarkers for sub-stratification in endometrial cancer

Costas, L., Frias-Gomez, J., et al., 2023. Immunologic signatures across molecular subtypes and potential biomarkers for sub-stratification in endometrial cancer. International Journal of Molecular Sciences 24, 1791

2023
[11]

scGPT: towards building a foundation model for single-cell multi-omics using generative AI

Cui, H., Wang, C., Maan, H., et al., 2024. scGPT: towards building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 933–944. Dai et al.:Preprint submitted to ElsevierPage 25 of 27 EndoGov: Knowledge-Governed Multi-Agent EC Risk Stratification

2024
[12]

Comparingtheareasundertwoormorecorrelatedreceiveroperatingcharacteristic curves: a nonparametric approach

DeLong,E.R.,DeLong,D.M.,Clarke-Pearson,D.L.,1988. Comparingtheareasundertwoormorecorrelatedreceiveroperatingcharacteristic curves: a nonparametric approach. Biometrics 44, 837–845

1988
[13]

TITAN: A multimodal whole-slide foundation model for pathology

Ding, T., Wagner, S.J., Song, A.H., Chen, R.J., Lu, M.Y., Zhang, A., et al., 2024. TITAN: A multimodal whole-slide foundation model for pathology. arXiv preprint arXiv:2411.19666

work page arXiv 2024
[14]

Proteogenomic characterization of endometrial carcinoma

Dou, Y., et al., 2020. Proteogenomic characterization of endometrial carcinoma. Cell 182, 1–22

2020
[15]

AI-based histopathology image analysis reveals a distinct subset of endometrial cancers

Fremond, S., et al., 2024. AI-based histopathology image analysis reveals a distinct subset of endometrial cancers. Nat. Commun. 15, 1–12

2024
[16]

The Llama 3 Herd of Models

Grattafiori, A., Dubey, A., Jauhri, A., et al., 2024. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783

work page internal anchor Pith review arXiv 2024
[17]

Improved preoperative risk stratification in endometrial carcinoma patients: external validation of the ENDORISK Bayesian network model in a large population-based case series

Grube, M., Reijnen, C., Lucas, P.J.F., et al., 2023. Improved preoperative risk stratification in endometrial carcinoma patients: external validation of the ENDORISK Bayesian network model in a large population-based case series. J. Cancer Res. Clin. Oncol. 149, 7555–7565

2023
[18]

Population-based screening for endometrial cancer: Human vs

Hart, G.R., Yan, V., Huang, G.S., Liang, Y., Nartowt, B.J., Muhammad, W., Deng, J., 2020. Population-based screening for endometrial cancer: Human vs. machine intelligence. Front. Artif. Intell. 3, 539879

2020
[19]

arXiv preprint arXiv:2512.15398 doi:10.48550/arXiv.2512.15398

He,Z.,Li,M.,Shi,L.,Dai,W.,Nie,L.,2025.Mapis:Aknowledge-graphgroundedmulti-agentframeworkforevidence-basedPCOSdiagnosis. arXiv preprint arXiv:2512.15398 doi:10.48550/arXiv.2512.15398

work page doi:10.48550/arxiv.2512.15398 2025
[20]

Attention-based deep multiple instance learning, in: ICML, pp

Ilse, M., Tomczak, J., Welling, M., 2018. Attention-based deep multiple instance learning, in: ICML, pp. 2127–2136

2018
[21]

Modeling dense multimodal interactions between biological pathways and histology for survival prediction, in: CVPR, pp

Jaume, G., Vaidya, A., Chen, R.J., Williamson, D.F., Liang, P.P., Mahmood, F., 2024. Modeling dense multimodal interactions between biological pathways and histology for survival prediction, in: CVPR, pp. 11579–11590

2024
[22]

Risk stratification of endometrial cancer patients: FIGO stage, biomarkers and molecular classification

Kasius, J.C., Pijnenborg, J.M.A., Lindemann, K., Forsse, D., van Zwol, J., Kristensen, G.B., Krakstad, C., Werner, H.M.J., Amant, F., 2021. Risk stratification of endometrial cancer patients: FIGO stage, biomarkers and molecular classification. Cancers 13, 5848

2021
[23]

Deep learning models differentiate tumor grades from H&E stained histology sections

Khoshdeli, M., Borowsky, A., Parvin, B., 2018. Deep learning models differentiate tumor grades from H&E stained histology sections. 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) , 620–623

2018
[24]

EndometrialcancerriskstratificationusingMRIradiomics:corroboratingwithcholinemetabolism

Lin,Y.,Wu,R.C.,Lin,Y.C.,Huang,Y.L.,Lin,C.Y.,Lo,C.J.,Lu,H.Y.,Lu,K.Y.,Tsai,S.Y.,Hsieh,C.Y.,Yang,L.Y.,Cheng,M.L.,Chao,A., Lai,C.H.,Lin,G.,2024. EndometrialcancerriskstratificationusingMRIradiomics:corroboratingwithcholinemetabolism. CancerImaging 24, 112. doi:10.1186/s40644-024-00756-x

work page doi:10.1186/s40644-024-00756-x 2024
[25]

EvoMDT: A self-evolving multi-agent system for structured clinical decision-making in multi-cancer

Liu, Q., Hu, Z., Huang, T., Niu, Y., Zhang, X., Ma, S., Lin, C., Goh, K.H., Kwon, H.E., Gao, F., Sun, X., Ying, Z., Qiang, G., 2026. EvoMDT: A self-evolving multi-agent system for structured clinical decision-making in multi-cancer. npj Digital Medicine 9, 124. doi:10.1038/s41746-025-02304-8

work page doi:10.1038/s41746-025-02304-8 2026
[26]

Predicting risk stratification in early-stage endometrial carcinoma: significance of multiparametric MRI radiomics model

Meng, H., Sun, Y., Zhang, Y., et al., 2024. Predicting risk stratification in early-stage endometrial carcinoma: significance of multiparametric MRI radiomics model. Journal of Digital Imaging 37, 230–239. doi:10.1007/s10278-023-00936-4

work page doi:10.1007/s10278-023-00936-4 2024
[27]

Foundation models for generalist medical artificial intelligence

Moor, M., et al., 2023. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265

2023
[28]

Nzomo, M., Moodley, D., 2025. Integrating knowledge graphs and bayesian networks: A hybrid approach for explainable disease risk prediction, in: Proceedings of the 2025 IEEE 49th Annual Computers, Software, and Applications Conference, pp. 834–844

2025
[29]

Endometrial cancer: ESMO Clinical Practice Guideline for diagnosis, treatment and follow-up

Oaknin,A.,Bosse,T.,Creutzberg,C.,Giornelli,G.,Harter,P.,Joly,F.,Lorusso,D.,Marth,C.,Makker,V.,Mirza,M.,etal.,2022. Endometrial cancer: ESMO Clinical Practice Guideline for diagnosis, treatment and follow-up. Ann. Oncol. 33, 860–877

2022
[30]

GPT-4o system card

OpenAI, 2024. GPT-4o system card. OpenAI Technical Report Available athttps://openai.com/index/gpt-4o-system-card/

2024
[31]

Staining invariant features for improving generalization of deep convolutional neural networks in computational pathology

Otálora, S., Atzori, M., Andrearczyk, V., Khan, A., Müller, H., 2019. Staining invariant features for improving generalization of deep convolutional neural networks in computational pathology. Frontiers in Bioengineering and Biotechnology 7, 198

2019
[32]

Construction of prognostic risk assessment model of endometrial cancer based on miRNAs

Peng, X., Kong, Y., Yan, Z., 2023. Construction of prognostic risk assessment model of endometrial cancer based on miRNAs. Molecular and Cellular Biochemistry 478, 2767–2783

2023
[33]

Clinical-grade AI model for molecular subtyping of endometrial cancer: a multi-center cohort study in China

Qi, P., Yao, T., Li, H., et al., 2025. Clinical-grade AI model for molecular subtyping of endometrial cancer: a multi-center cohort study in China. Molecular Biomedicine 6, 72. doi:10.1186/s43556-025-00341-z

work page doi:10.1186/s43556-025-00341-z 2025
[34]

POLE-relatedgenesignaturepredictsprognosis,immunefeature,anddrugtherapyinhumanendometrioid carcinoma

Qiu,W.,Zhang,R.,Qian,Y.,2024. POLE-relatedgenesignaturepredictsprognosis,immunefeature,anddrugtherapyinhumanendometrioid carcinoma. Heliyon 10, e29548. doi:10.1016/j.heliyon.2024.e29548

work page doi:10.1016/j.heliyon.2024.e29548 2024
[35]

Preoperativeriskstratificationinendometrialcancer(ENDORISK)byaBayesiannetwork model: A development and validation study

Reijnen,C.,Gogou,E.,Visser,N.C.M.,etal.,2020. Preoperativeriskstratificationinendometrialcancer(ENDORISK)byaBayesiannetwork model: A development and validation study. PLoS Med. 17, e1003111

2020
[36]

Improved preoperative risk stratification with CA-125 in low-grade endometrial cancer: a multicenter prospective cohort study

Reijnen, C., Visser, N.C.M., Kasius, J.C., 2019. Improved preoperative risk stratification with CA-125 in low-grade endometrial cancer: a multicenter prospective cohort study. Journal of Gynecologic Oncology 30, e70

2019
[37]

Agentclinic: a multimodal agent benchmark to evaluate ai in simulated clinical envi- ronments

Schmidgall, S., et al., 2024. AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments. arXiv preprint arXiv:2405.07960

work page arXiv 2024
[38]

Clarity: Clinical assistant for routing, inference, and triage

Shaposhnikov, V., Nesterov, A., Kopanichuk, I., Bakulin, I., Zhelvakov, E., Abramov, R., et al., 2025. Clarity: Clinical assistant for routing, inference, and triage. arXiv preprint arXiv:2510.02463

work page arXiv 2025
[39]

Large language models encode clinical knowledge

Singhal, K., et al., 2023. Large language models encode clinical knowledge. Nature 620, 172–180

2023
[40]

Deep neural network models for computational histopathology: A survey

Srinidhi, C.L., Ciga, O., Martel, A.L., 2021. Deep neural network models for computational histopathology: A survey. Med. Image Anal. 67, 101813. doi:10.1016/j.media.2020.101813

work page doi:10.1016/j.media.2020.101813 2021
[41]

Improved risk assessment by integrating molecular and clinicopathological factors in early- stage endometrial cancer: Combined analysis of the PORTEC cohorts

Stelloo, E., Nout, R.A., Osse, E.M., et al., 2016. Improved risk assessment by integrating molecular and clinicopathological factors in early- stage endometrial cancer: Combined analysis of the PORTEC cohorts. Clinical Cancer Research 22, 4215–4224

2016
[42]

Cpath-omni: A unified multimodal foundation model for patch and whole slide image analysis in computational pathology

Sun, Y., Si, Y., Zhu, C., Gong, X., Zhang, K., Chen, P., Zhang, Y., Shui, Z., Lin, T., Yang, L., 2024. Cpath-omni: A unified multimodal foundation model for patch and whole slide image analysis in computational pathology. arXiv preprint arXiv:2412.12077

work page arXiv 2024
[43]

Integrated genomic characterization of endometrial carcinoma

The Cancer Genome Atlas Research Network, 2013. Integrated genomic characterization of endometrial carcinoma. Nature 497, 67–73

2013
[44]

Towards generalist biomedical AI

Tu, T., et al., 2024. Towards generalist biomedical AI. NEJM AI 1, AIoa2300138

2024
[45]

Clinical Cancer Research 21, 3347–3355

VanGool,I.C.,Eggink,F.A.,Freeman-Mills,L.,etal.,2015.POLEproofreadingmutationselicitanantitumorimmuneresponseinendometrial cancer. Clinical Cancer Research 21, 3347–3355

2015
[46]

Prediction of recurrence risk in endometrial cancer with multimodal deep learning

Volinsky-Fremond, S., et al., 2024. Prediction of recurrence risk in endometrial cancer with multimodal deep learning. Nat. Med. 30, 1962–1973. Dai et al.:Preprint submitted to ElsevierPage 26 of 27 EndoGov: Knowledge-Governed Multi-Agent EC Risk Stratification

2024
[47]

A foundation model for clinical-grade computational pathology and rare cancers detection

Vorontsov, E., et al., 2024. A foundation model for clinical-grade computational pathology and rare cancers detection. Nat. Med. 30, 2924– 2935

2024
[48]

A machine learning-based immune response signature to facilitate prognosis prediction in patients with endometrial cancer

Wang, X., Guan, J., Feng, L., 2024. A machine learning-based immune response signature to facilitate prognosis prediction in patients with endometrial cancer. Journal of Translational Medicine 22, 1–14

2024
[49]

Molecular and AI enabled prognostication in endometrial cancer: a 2015 to 2024 bibliometric atlas and critical review

Wang, X., Wang, Q., Ding, G., Wang, J., Feng, Y., 2026. Molecular and AI enabled prognostication in endometrial cancer: a 2015 to 2024 bibliometric atlas and critical review. Discover Oncology 17, 521. doi:10.1007/s12672-026-04734-6

work page doi:10.1007/s12672-026-04734-6 2026
[50]

Frequent POLE-driven hypermutation in ovarian endometrioid cancer revealed by mutational signatures in RNA sequencing

Wang, Y., et al., 2021. Frequent POLE-driven hypermutation in ovarian endometrioid cancer revealed by mutational signatures in RNA sequencing. BMC Medical Genomics 14, 186

2021
[51]

Automated construction of medical indicator knowledge graphs using retrieval augmented large language models

Wang, Z., Shi, D., Zhao, J., Diao, X., Tang, X., Qin, Y., 2025. Automated construction of medical indicator knowledge graphs using retrieval augmented large language models. arXiv preprint arXiv:2511.13526

work page arXiv 2025
[52]

Wu, J., Zhu, J., Qi, Y., Chen, J., Xu, M., Menolascina, F., Jin, Y., Grau, V., 2025. Medical Graph RAG: Towards safe medical large language model via graph retrieval-augmented generation, in: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 28443–28467

2025
[53]

HGTDG-net: an interpretable heterogeneous graph transformer framework for cancer driver gene prediction

Xiong, S., Wang, Z., Zhang, J., 2023. HGTDG-net: an interpretable heterogeneous graph transformer framework for cancer driver gene prediction. Briefings in Bioinformatics 24, bbad223

2023
[54]

A whole-slide foundation model for digital pathology from real-world data

Xu, H., et al., 2024. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188

2024
[55]

MedAgent-Pro: Towards evidence-based multi-modal medical diagnosis via reasoning agentic workflow

Xu, J., et al., 2025a. MedAgent-Pro: Towards evidence-based multi-modal medical diagnosis via reasoning agentic workflow. arXiv preprint arXiv:2503.18968

work page arXiv
[56]

Multimodal optimal transport-based co-attention transformer with global structure consistency for survival prediction

Xu, Y., Chen, H., 2023. Multimodal optimal transport-based co-attention transformer with global structure consistency for survival prediction, in: ICCV, pp. 21241–21251

2023
[57]

A versatile pathology co-pilot via reasoning enhanced multimodal large language model

Xu, Z., Liu, Z., Hou, J., Ma, J., Jin, C., Wang, Y., Chen, Z., Zhang, Z., Huang, F., Guo, Z., Zhou, F., Xu, Y., Wang, X., Chan, R.C.K., Liang, L., Chen, H., 2025b. A versatile pathology co-pilot via reasoning enhanced multimodal large language model. arXiv preprint arXiv:2507.17303

work page arXiv
[58]

PathOrchestra: a comprehensive foundation model for computational pathology with over 100 diverse clinical-grade tasks

Yan, F., Wu, J., Li, J., Wang, W., Lu, J., Chen, W., et al., 2025. PathOrchestra: a comprehensive foundation model for computational pathology with over 100 diverse clinical-grade tasks. npj Digit. Med. 8, 1–15

2025
[59]

Qwen2.5 Technical Report

Yang, A., Yang, B., Zhang, B., et al., 2024. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115

work page arXiv 2024
[60]

Uncovering stromal cell fate genes and a novel risk stratification in UCEC by integrating single-cell RNA sequencing and multi-omics analysis

Zhang, R., Ma, H., Yang, Y., Lv, S., et al., 2026a. Uncovering stromal cell fate genes and a novel risk stratification in UCEC by integrating single-cell RNA sequencing and multi-omics analysis. Genes & Diseases 13, 101460. doi:10.1016/j.gendis.2025.101460

work page doi:10.1016/j.gendis.2025.101460 2025
[61]

OMGs: A multi-agent system supporting MDT decision-making across the ovarian tumour care continuum

Zhang, Z., Wang, Z., Xu, J., et al., 2026b. OMGs: A multi-agent system supporting MDT decision-making across the ovarian tumour care continuum. arXiv preprint arXiv:2602.13793

work page arXiv
[62]

Kg4diagnosis: A hierarchical multi-agent LLM framework with knowledge graph enhancement

Zuo, K., et al., 2024. Kg4diagnosis: A hierarchical multi-agent LLM framework with knowledge graph enhancement. arXiv preprint arXiv:2404.14510

work page arXiv 2024