arXiv: 2606.25956 · v2 · pith:5PU72XOKnew · submitted 2026-06-24 · 💻 cs.CV · cs.AI · cs.LG · eess.IV

Pulmonary Embolism Risk Stratification from CTPA and Medical Records: Vascular Graphs Are Not All You Need

Pith reviewed 2026-06-29 04:47 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG · eess.IV
keywords pulmonary embolism · risk stratification · CTPA · graph neural networks · vascular graphs · medical records · cardiac biomarkers · tabular models

The pith

Vascular graphs from CTPA appear to add no value to pulmonary embolism risk stratification beyond medical records and cardiac biomarkers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper benchmarks ways to combine medical records, cardiac biomarkers extracted from CTPA scans, and detailed vascular graph data for classifying PE risk levels. It shows that medical records and cardiac biomarkers drive performance among global features, while vascular biomarkers add nothing. Graph neural networks applied directly to the pulmonary vascular tree also fail to beat a strong tabular model that uses only the global features. A sympathetic reader would care because blood tests are frequently unavailable in practice, so confirming that complex vascular analysis brings no gain could simplify real-world decision tools. The work uses a private dataset of 353 cases with unusually complete annotations to reach this conclusion.

Core claim

The central claim has three parts. Among global features, medical records and cardiac biomarkers are the most significant predictors for PE risk stratification. Vascular biomarkers extracted from CTPA do not further improve results. Even GNNs operating on the vascular tree's intrinsic graph representation fail to outperform a strong tabular baseline on global features alone. From this, the authors conclude that vascular graphs might hold no discriminative information for this task.

What carries the argument

Direct performance comparison between tabular models trained on global features (medical records plus cardiac biomarkers) and graph neural networks trained on the pulmonary vascular graph extracted from CTPA images.

If this is right

  • Medical records together with cardiac biomarkers extracted from CTPA are sufficient to achieve the observed stratification performance.
  • Vascular biomarkers derived from the same CTPA images provide no measurable gain when added to the global feature set.
  • Graph neural networks on vascular graphs do not exceed the accuracy of tabular models that ignore the vascular structure entirely.
  • Discriminative signals for PE risk reside primarily in systemic patient data rather than in local vascular geometry.
  • Any explanation for the GNN underperformance must address either the model class or properties of the vascular data itself.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the finding holds, clinical pipelines could de-emphasize vascular graph extraction and focus resources on ensuring complete medical records and basic biomarker collection.
  • The same tabular-versus-graph comparison could be run on other vascular or cardiac risk tasks to test whether graph representations are broadly uninformative.
  • Alternative node definitions or edge-weighting schemes for the vascular graph could be tried on this dataset to isolate whether the construction method itself hides useful signals.
  • The result suggests that PE risk may be driven more by global physiological state than by detectable local vascular remodeling visible on standard CTPA.

Load-bearing premise

The GNN architectures and vascular graph construction methods used are adequate to extract discriminative information from the vascular tree if any such information exists in the data.

What would settle it

A different GNN architecture or vascular graph construction method that achieves significantly higher accuracy than the tabular baseline on the same 353-case dataset, under a paired statistical test, would falsify the claim that vascular graphs hold no discriminative information.
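One way to make "significantly higher accuracy" concrete is a paired test on the same evaluation cases. The sketch below is an illustration only: it assumes per-case predictions from both models are available, and it uses an exact McNemar test, which is our choice and not something the paper is stated to use. The function name `mcnemar_exact` and its inputs are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts cases that only model A
    classifies correctly and c counts cases that only model B classifies
    correctly. Cases where both models agree in correctness are ignored.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        # The two models never disagree in correctness: no evidence either way.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis, discordant cases split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)
```

As a toy usage example, with ten cases where model A is always right and model B is wrong on eight of them, `mcnemar_exact([1] * 10, [1] * 10, [0] * 8 + [1] * 2)` gives `b = 8`, `c = 0` and a p-value of about 0.0078. With only 353 cases split across folds, the number of discordant cases may be small, so a non-significant result would not by itself show that the two models are equivalent.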

Figures

Figures reproduced from arXiv: 2606.25956 by Allan Serva, Johannes F. Lutzeyer, Laurent Bertoletti, Morgane des Ligneris, Nathan Painchaud, Odyssée Merveille, Pierre Croisille, Thomas Lampert, Tristan Habémont.

Figure 1: Proposed pipeline for PE risk stratification, including CTPA preprocessing. We first preprocess high-dimensional 3D CTPA by segmenting structures of interest for PE, i.e., ventricles and pulmonary vasculature. The latter is then converted into a graph. Finally, cardiac and vascular biomarkers are computed from the ventricle masks and vascular graph, respectively (see Section 2.1). The stratification model …

Figure 2: Schemas summarizing benchmarked methods for combining global features and vascular graphs for patient risk classification. Some layers (e.g., residual connections, norm, dropout) are not shown for readability.

Abstract

Risk stratification for pulmonary embolism (PE) is critical for clinical decision-making. Stratification guidelines are based on patient medical records, parameters measured from computed tomography pulmonary angiography (CTPA), and blood tests. However, blood tests are often missing in routine practice. This work studies whether state-of-the-art models can accurately classify risk levels from only medical records and biomarkers extracted from CTPA images. We benchmark different approaches to combine medical records and cardiac biomarkers with rich pulmonary vascular information; we add vascular biomarkers to tabular models and apply graph neural networks (GNNs) on the vascular tree's intrinsic graph representation. We use a private dataset (n=353) with uniquely complete data for PE risk stratification. Our results show that, among global features, medical records and cardiac biomarkers are the most significant predictors, while vascular biomarkers do not further improve stratification. More surprisingly, even GNNs on vascular graphs fail to outperform a strong tabular baseline on global features. We consider hypotheses, on both models and data, that could explain this suboptimal performance. Our investigation suggests that, counter-intuitively, vascular graphs might hold no discriminative information for PE risk stratification. Code is available from https://github.com/creatis-myriad/GENESIS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript reports an empirical benchmark on PE risk stratification using a private dataset of 353 patients with complete CTPA and medical record data. It compares tabular models using medical records, cardiac biomarkers, and vascular biomarkers against GNNs applied to vascular tree graphs derived from CTPA segmentations. The central claim is that medical records and cardiac biomarkers are the strongest predictors, vascular biomarkers add no value, and GNNs on vascular graphs do not outperform strong tabular baselines on global features, leading to the suggestion that vascular graphs hold no discriminative information for this task. Hypotheses on models and data are considered, and code is released.

Significance. If the negative result on vascular graphs is robust, the work would indicate that graph-based representations of pulmonary vasculature from CTPA do not capture additional risk-stratification signal beyond standard clinical tabular features, which could simplify model design in this clinical domain and question the added value of GNNs here. Credit is due for releasing code and for explicitly enumerating hypotheses to explain the observed performance gap.

major comments (2)
  1. [Abstract] The claim that 'vascular graphs might hold no discriminative information for PE risk stratification' is a direct inference from GNNs failing to beat tabular baselines. This interpretation rests on the assumption that the chosen vascular graph construction (node/edge definitions from CTPA segmentations) and GNN message-passing are expressive enough to recover any existing vascular predictors (e.g., embolus location, branching topology, or diameter changes) if present in the data. No positive control (synthetic recovery test or known vascular signal) is described to validate this expressiveness.
  2. [Results/Discussion] The manuscript notes that hypotheses on models and data were considered, yet the reported experiments appear to lack ablation details on GNN depth, aggregation functions, or alternative graph constructions that could isolate whether the negative result stems from insufficient model capacity or from a true absence of signal. This weakens the support for the central negative claim.
minor comments (1)
  1. [Methods] The private dataset precludes full external verification of splits and preprocessing; while code release helps, additional reporting of performance error bars, exact data splits, and full ablation tables would improve clarity of the empirical claims.
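
The positive control requested in major comment 1 could be set up roughly as follows. This is a minimal sketch under our own assumptions, not the authors' pipeline: the random trees, the single `diam` node feature, the 0.3 narrowing factor, and every function name are hypothetical. The hand-written one-hop readout only stands in for a trained GNN; its role here is to confirm that the planted signal is recoverable in principle by a single round of parent-to-child message passing followed by min pooling.

```python
import random


def make_graph(rng, n_nodes=30, planted=False):
    """Random rooted tree with one 'diameter' feature per node.

    parent[i] is the parent of node i (node 0 is the root). Diameters
    shrink mildly from parent to child. If `planted` is True, one
    non-root node gets an abrupt narrowing: the synthetic signal.
    """
    parent = [-1] + [rng.randrange(i) for i in range(1, n_nodes)]
    target = rng.randrange(1, n_nodes) if planted else None
    diam = [10.0] * n_nodes
    for i in range(1, n_nodes):
        # Hypothetical obstruction: 0.3 instead of the usual 0.85-0.95 taper.
        factor = 0.3 if i == target else rng.uniform(0.85, 0.95)
        diam[i] = diam[parent[i]] * factor
    return parent, diam


def make_dataset(n_graphs=200, seed=0):
    """Balanced dataset: label 1 graphs carry the planted narrowing."""
    rng = random.Random(seed)
    return [(make_graph(rng, planted=bool(k % 2)), k % 2) for k in range(n_graphs)]


def one_hop_min_ratio(parent, diam):
    """Each node reads its parent's diameter, then min-pool over the graph."""
    return min(diam[i] / diam[parent[i]] for i in range(1, len(parent)))


def control_accuracy(data, threshold=0.6):
    """Accuracy of thresholding the one-hop readout against the labels."""
    hits = sum(int(one_hop_min_ratio(p, d) < threshold) == y for (p, d), y in data)
    return hits / len(data)
```

On this toy data the readout separates the two classes, so a GNN trained on the same graphs that stayed near chance would point to a problem with the model or training setup and not with the data. A real control would inject the signal into the actual vascular graphs and train the benchmarked architectures on them.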

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our empirical benchmark. We address the two major comments point by point below, agreeing that additional validation would strengthen the central negative claim on vascular graphs.

  1. Referee: [Abstract] The claim that 'vascular graphs might hold no discriminative information for PE risk stratification' is a direct inference from GNNs failing to beat tabular baselines. This interpretation rests on the assumption that the chosen vascular graph construction (node/edge definitions from CTPA segmentations) and GNN message-passing are expressive enough to recover any existing vascular predictors (e.g., embolus location, branching topology, or diameter changes) if present in the data. No positive control (synthetic recovery test or known vascular signal) is described to validate this expressiveness.

    Authors: We agree that an explicit positive control would provide stronger support for the claim that the negative result reflects absence of signal rather than insufficient model expressiveness. The manuscript tested multiple standard GNN architectures (GCN, GAT, GraphSAGE) known to be expressive on tree-structured graphs, but did not include a synthetic recovery test. In the revision we will add such a control: we will inject synthetic vascular predictors (e.g., simulated embolus locations and diameter changes) into the graphs and verify that the GNNs recover the injected signal above chance, thereby validating that the chosen construction and message-passing are capable of detecting vascular information if present. revision: yes

  2. Referee: [Results/Discussion] The manuscript notes that hypotheses on models and data were considered, yet the reported experiments appear to lack ablation details on GNN depth, aggregation functions, or alternative graph constructions that could isolate whether the negative result stems from insufficient model capacity or from a true absence of signal. This weakens the support for the central negative claim.

    Authors: We performed internal ablations on GNN depth and aggregation during development and reported the best-performing configurations, but these details were condensed in the main text and only partially shown in the supplement. We agree that fuller reporting is warranted to isolate capacity versus signal absence. In the revised manuscript we will expand both the main text and supplementary material with systematic ablations covering GNN depth (2–6 layers), aggregation functions (mean, max, sum, attention), and alternative graph constructions (edge features for vessel diameter, different node definitions based on branching points). These additions will directly address whether the performance gap persists across model variations. revision: yes
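
The ablation space described in the second response can be enumerated explicitly, which makes its size and coverage easy to check. The sketch below is illustrative only: the depth range and aggregation names come from the rebuttal text, while the graph-variant labels, the `ablation_grid` and `beats_baseline` helpers, and the result format are hypothetical.

```python
from itertools import product

DEPTHS = range(2, 7)  # 2 to 6 layers, as stated in the rebuttal
AGGREGATIONS = ("mean", "max", "sum", "attention")
# Hypothetical labels for the graph constructions named in the rebuttal.
GRAPH_VARIANTS = ("baseline", "edge_diameter", "branching_nodes")


def ablation_grid():
    """Return one config dict per (depth, aggregation, graph) combination."""
    return [
        {"depth": d, "aggregation": a, "graph": g}
        for d, a, g in product(DEPTHS, AGGREGATIONS, GRAPH_VARIANTS)
    ]


def beats_baseline(results, baseline_score):
    """Keep the configs whose score exceeds the tabular baseline.

    `results` is an iterable of (config, score) pairs produced by
    whatever training and evaluation routine is used.
    """
    return [cfg for cfg, score in results if score > baseline_score]
```

With these settings the grid holds 5 × 4 × 3 = 60 configurations. Because so many variants are compared against one baseline on a small dataset, any configuration that comes out ahead would still need a correction for multiple comparisons before being read as evidence of vascular signal.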

Circularity Check

0 steps flagged

No circularity: empirical benchmarking on held-out data

The paper reports direct empirical comparisons of tabular models, vascular biomarkers, and GNNs on vascular graphs using a private dataset (n=353) with held-out evaluation. No mathematical derivations, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or described methods. The central claim (vascular graphs hold no discriminative information) follows from observed performance gaps on external splits rather than any self-definitional reduction or ansatz smuggled via prior work. This matches the default expectation of a non-circular empirical study.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard machine learning assumptions about data representativeness and model capacity rather than new free parameters or invented entities.

axioms (1)
  • domain assumption: The private dataset of 353 patients with complete PE risk data is representative of the broader population and free of selection bias that would mask vascular graph information.
    The generalization that vascular graphs hold no discriminative information depends on this assumption about the dataset.

pith-pipeline@v0.9.1-grok · 5800 in / 1331 out tokens · 35251 ms · 2026-06-29T04:47:40.976074+00:00 · methodology
