Foundation Models vs. Radiomics for Lung Computed Tomography: A Benchmark of Feature Extractors, Classification Heads, and Segmentation Choices

Martin Maurer; Nils Neukirch; Nils Strodthoff

arxiv: 2607.01001 · v1 · pith:Y2OKOFEAnew · submitted 2026-07-01 · 💻 cs.CV · cs.LG

Pith reviewed 2026-07-02 14:01 UTC · model grok-4.3

keywords: lung cancer, CT imaging, radiomics, foundation models, feature extraction, classification heads, tumor segmentation, cross-cohort evaluation

The pith

Segmentation drives volume and stage tasks while classifier choice drives survival and histology in lung CT phenotyping.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This benchmark isolates the separate contributions of feature extractor, classification head, and segmentation regime across five lung CT tasks, training on one cohort and evaluating on both an internal test set and an external cohort. Segmentation regime emerges as the main performance driver for tumor volume and stage classification, while the choice among heads such as CatBoost or logistic regression drives results on survival, histology, and age prediction. Curia features combined with tumor segmentation and a CatBoost head achieve the best mean rank across the three primary clinical tasks. Radiomics stays competitive on volume and stage tasks, partly because of how labels were derived, and patch and slice aggregation have negligible impact. When tumor outlines are absent, Curia-2 with lung segmentation and logistic regression remains a workable option.

Core claim

The study demonstrates that in controlled two-stage pipelines evaluated on worst-case cross-cohort performance, segmentation choice primarily governs accuracy for tumor volume and stage classification while classification head selection primarily governs accuracy for 2-year survival, histology classification, and age prediction. Curia with tumor segmentation and CatBoost reaches the best average rank across the three main clinical tasks, though per-task selection consistently beats any single default; radiomics performs competitively on volume and stage tasks partly due to label-derivation effects, DINOv3 trails slightly, and patch or slice aggregation adds little value.

What carries the argument

The two-stage pipeline that decouples feature extraction from the classification head, tested across five extractors, seven heads, and three segmentation regimes with worst-case cross-cohort performance as the primary metric.
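The factorial structure of the benchmark can be sketched as a plain enumeration of factor levels. This is a minimal illustration, not the authors' code: the level names are taken from the abstract and the Figure 1 caption, the full crossing of all levels is an assumption (the paper may exclude some combinations), and the patch and slice aggregation factors shown in Figure 2 are left out.

```python
from itertools import product

# Factor levels as named in the abstract and the Figure 1 caption.
EXTRACTORS = ["Curia", "Curia-2", "DINOv3", "Radiomics2D", "Radiomics3D"]
HEADS = ["TabPFN", "TabICL", "XGBoost", "CatBoost", "Random Forest",
         "logistic regression", "Ridge"]
SEGMENTATIONS = ["no mask", "lung mask", "tumor mask"]


def enumerate_configs():
    """Yield one dict per (extractor, segmentation, head) combination.

    Assumption: every level is crossed with every other level.
    """
    for extractor, segmentation, head in product(EXTRACTORS, SEGMENTATIONS, HEADS):
        yield {"extractor": extractor, "segmentation": segmentation, "head": head}


configs = list(enumerate_configs())
print(len(configs))  # 5 * 3 * 7 = 105
```

Because stage one (feature extraction under a given segmentation) does not depend on the head, features for each (extractor, segmentation) pair could be computed once and reused across all heads, which is the practical benefit of decoupling the two stages.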

If this is right

Tumor segmentation should be prioritized for volume and stage tasks regardless of extractor type.
Gradient-boosting heads such as CatBoost improve survival and histology results over linear or bagged-tree alternatives.
Curia features reach peak scores comparable to radiomics on survival while remaining competitive elsewhere.
Lung segmentation plus logistic regression supplies a practical fallback when tumor delineations are unavailable.
Task-specific selection of head and segmentation outperforms any cross-task default pipeline.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Medical imaging pipelines may benefit from endpoint-specific tuning rather than a single recommended configuration.
The benchmark design of isolating extractor, head, and segmentation could be repeated on other cancer types to check whether the same task-dependent pattern appears.
The limited value of aggregation steps suggests that simpler preprocessing pipelines can be used without harming cross-cohort robustness.

Load-bearing premise

Performance differences between design choices mainly reflect the isolated effects of extractor, head, and segmentation rather than unmeasured differences in imaging protocols or label derivation between the two cohorts.

What would settle it

A controlled re-run on cohorts matched for imaging protocol and label source that finds the relative importance of segmentation versus classifier choice reverses or disappears.

Figures

Figures reproduced from arXiv: 2607.01001 by Martin Maurer, Nils Neukirch, Nils Strodthoff.

**Figure 1.** Benchmark overview. For each CT volume we vary three axes: (i) the segmentation provided to the feature extractor (no mask, lung mask or tumor mask); (ii) the feature extractor: a foundation model (Curia, Curia-2, DINOv3) or hand-engineered radiomics (2D, 3D); (iii) the classification head: a tabular foundation model (TabPFN, TabICL), a linear model (logistic regression / Ridge), a bagged tree (Random Fore…

**Figure 2.** LUNG1 vs. LUNG2 performance for all configurations. Each point is one (features, segmentation, patch agg, slice agg, head) configuration; columns are the five tasks; points below the dashed diagonal have a performance drop on the external cohort. The five panels color the same data by different pipeline factors: (a) feature extractor, (b) classification head, (c) segmentation choice, (d) patch aggregation …

**Figure 3.** Best configuration per factor level. Bar length = robust score: min(LUNG1, LUNG2) AUC for classification tasks; max(LUNG1, LUNG2) MAE for age prediction (i.e. worst-case error across cohorts; note the reversed x-axis). Filled circles show the LUNG1 (black) and LUNG2 (gray) scores of that same best-robust configuration; dotted lines indicate the single-cohort best scores (the highest LUNG1 score and highest…

**Figure 4.** Tumor volume: robustness by factor. Metric is AUC; each dot is one configuration; higher is better; thick tick is the median. Radiomics features sit clearly to the right of all other extractors, a direct consequence of the volume label being derived from the same tumor segmentation used by radiomics; foundation-model configurations show larger spread. Segmentation choice has a pronounced effect: the tumor …

**Figure 5.** Tumor stage classification: robustness by factor. Metric is AUC; each dot is one configuration; higher is better; thick tick is the median. Radiomics 2D and 3D achieve the highest median robust scores among feature extractors, with foundation models showing comparable spread but lower medians. Segmentation is the dominant factor: the tumor mask yields a strong advantage (median clearly right of lung-mask a…

**Figure 6.** 2-year survival prediction: robustness by factor. Metric is AUC; each dot is one configuration; higher is better; thick tick is the median. Radiomics achieves the highest median robust score among feature extractors, though foundation models span a wider range and reach higher peak scores. The tumor mask yields the highest median among segmentation choices; lung-mask configurations sit between tumor mask a…

**Figure 7.** Histology type classification: robustness by factor. Metric is AUC; each dot is one configuration; higher is better; thick tick is the median. Curia achieves the highest median robust AUC among feature extractors, with Curia-2 and DINOv3 close behind; radiomics shows lower medians for this task. The three segmentation choices yield similar median scores, with lung segmentation achieving a slightly higher m…

**Figure 8.** Age prediction: robustness by factor. Metric is MAE (years); each dot is one configuration; lower is better, so the x-axis is reversed; thick tick is the median. Ridge regression performs poorly relative to all other heads by a large margin: its median robust MAE substantially exceeds that of tree-based and boosting methods, driving the large classifier Δ of 9.451 years reported in …
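The robust score defined in the Figure 3 caption (minimum AUC across cohorts for classification, maximum MAE across cohorts for age prediction) can be written as a small helper. The function name, the `higher_is_better` flag, and the example numbers are illustrative assumptions, not values from the paper.

```python
def robust_score(lung1, lung2, higher_is_better=True):
    """Worst-case score across the two evaluation cohorts.

    For metrics where higher is better (AUC) the worst case is the minimum;
    for error metrics (MAE) the worst case is the maximum.
    """
    return min(lung1, lung2) if higher_is_better else max(lung1, lung2)


# Hypothetical scores for one configuration (not taken from the paper).
robust_auc = robust_score(0.71, 0.64)                          # classification task
robust_mae = robust_score(7.9, 8.6, higher_is_better=False)    # age prediction, in years
print(robust_auc, robust_mae)  # 0.64 8.6
```

Scoring each configuration by its weaker cohort means a pipeline cannot look good merely by excelling on the internal test set while degrading on the external one.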

Abstract

Radiomics is the established approach for CT-based lung cancer phenotyping, yet comparisons with foundation models rarely isolate contributions of feature extractor, classification head, and segmentation choice, or test cross-cohort robustness. We benchmark five feature extractors (Curia, Curia-2, DINOv3, Radiomics2D, Radiomics3D), seven classification heads (TabPFN, TabICL, XGBoost, CatBoost, Random Forest, logistic regression, Ridge), and three segmentation regimes on five tasks: tumor volume and stage classification, 2-year survival prediction, histology classification, and age prediction. Models are trained on LUNG1 (n=338) and evaluated on an internal test set (n=84) and the external LUNG2 cohort (n=211), with worst-case cross-cohort performance as the primary metric. The dominant design factor is task-dependent: segmentation drives volume and stage classification, while classifier choice drives survival, histology, and age prediction. Radiomics is competitive for tumor volume, tumor stage and survival (partly due to label-derivation effects for the former); Curia variants reach comparable peak scores for survival; DINOv3 falls slightly short across tasks. Patch and slice aggregation have negligible impact. We recommend Curia with tumor segmentation and a CatBoost head as a safe default, achieving the best mean rank across the three primary clinical tasks, though task-specific selection consistently outperforms any cross-task default. When tumor delineations are unavailable, Curia-2 with lung segmentation and logistic regression offers a competitive alternative. All pipelines use a two-stage design suited to small cohort sizes where end-to-end fine-tuning would risk overfitting.
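To make the distinction between a mean-rank default and task-specific selection concrete, here is a minimal sketch. The configuration names, task names, and robust AUC values are hypothetical placeholders, and the ranking ignores ties for brevity; the paper's exact ranking procedure is not described on this page.

```python
# Hypothetical robust AUCs (higher is better); not results from the paper.
robust_auc = {
    "config_a": {"task_1": 0.70, "task_2": 0.62, "task_3": 0.66},
    "config_b": {"task_1": 0.74, "task_2": 0.58, "task_3": 0.61},
    "config_c": {"task_1": 0.66, "task_2": 0.63, "task_3": 0.64},
}
tasks = ["task_1", "task_2", "task_3"]


def ranks_for_task(scores, task):
    """Rank configurations on one task; rank 1 is the highest score (ties not handled)."""
    ordered = sorted(scores, key=lambda cfg: scores[cfg][task], reverse=True)
    return {cfg: position for position, cfg in enumerate(ordered, start=1)}


per_task_ranks = {task: ranks_for_task(robust_auc, task) for task in tasks}

# Cross-task default: the configuration with the lowest (best) mean rank.
mean_rank = {
    cfg: sum(per_task_ranks[task][cfg] for task in tasks) / len(tasks)
    for cfg in robust_auc
}
default = min(mean_rank, key=mean_rank.get)

# Task-specific selection: the best configuration for each task separately.
per_task_best = {
    task: max(robust_auc, key=lambda cfg: robust_auc[cfg][task]) for task in tasks
}

print(default)        # config_a
print(per_task_best)  # {'task_1': 'config_b', 'task_2': 'config_c', 'task_3': 'config_a'}
```

In this toy data the mean-rank winner is the best choice on only one of three tasks, which mirrors the abstract's point that a safe default and the per-task optimum need not coincide.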

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a straightforward empirical benchmark that isolates extractor, head, and segmentation effects on lung CT tasks and gives task-dependent rankings, but cohort shifts between LUNG1 and LUNG2 remain a plausible alternative explanation for the patterns.

The main thing to know is that the paper benchmarks five extractors, seven heads, and three segmentations on volume, stage, survival, histology, and age prediction using LUNG1 for training and LUNG2 for external testing, with worst-case cross-cohort performance as the key metric. It finds segmentation matters most for volume and stage while the head drives the other tasks, and recommends Curia plus tumor segmentation plus CatBoost as a safe default.

What is actually new is the specific cross-cohort ranking for this exact combination of components on these two cohorts; earlier comparisons did not break out the three factors this cleanly or report the external worst-case numbers. The work does well by sticking to standard splits, testing recent options like DINOv3 and TabPFN alongside radiomics, and explicitly noting label-derivation effects on the volume and stage results.

The soft spot is the attribution claim. The central pattern could still be driven by unmeasured differences in imaging protocols, reconstruction, or other cohort factors rather than the tested design choices. The paper flags label issues for two tasks but does not run sensitivity checks or controls for the broader set of confounds, so the task-dependent dominance conclusion rests on an assumption that is not fully tested.

This paper is for engineers and researchers who need concrete guidance on lung CT phenotyping pipelines. Readers facing similar small-cohort, multi-task settings will find the rankings and defaults useful. It deserves a serious referee because the setup is reproducible and the question is directly actionable, even if revisions should tighten the discussion of cohort effects.

Referee Report

2 major / 2 minor

Summary. The paper benchmarks five feature extractors (Curia, Curia-2, DINOv3, Radiomics2D, Radiomics3D), seven classification heads, and three segmentation regimes on five lung CT tasks (tumor volume, stage, 2-year survival, histology, age). Models are trained on LUNG1 (n=338) and evaluated on an internal hold-out plus external LUNG2 (n=211) using worst-case cross-cohort performance as the primary metric. It concludes that segmentation dominates volume/stage tasks while classifier choice dominates the others, recommends Curia + tumor segmentation + CatBoost as a safe default (best mean rank on primary tasks), and notes that task-specific selection outperforms any single default.

Significance. If the empirical patterns hold after controlling for cohort effects, the study supplies a practical, task-aware benchmark for small-cohort medical imaging pipelines. The explicit two-stage design, external validation, and worst-case metric are strengths that directly address overfitting risks common in this domain.

major comments (2)

[Abstract and Results] The central claim that 'the dominant design factor is task-dependent' and the resulting recommendation rest on the assumption that observed performance gaps isolate the effects of extractor/head/segmentation. The manuscript notes label-derivation effects for volume/stage but provides no quantitative sensitivity analysis or ablation for other cohort-level confounds (imaging protocol, reconstruction parameters) between LUNG1 and LUNG2; this directly affects attribution of the task-specific patterns.
[Results] In the performance tables, the paper reports mean ranks and 'best' configurations without mentioning statistical significance testing (paired tests, confidence intervals, or multiple-comparison correction) on the cross-cohort differences. Given that the recommendation and 'dominant factor' statements rely on these rankings, absence of such tests weakens the strength of the comparative claims.

minor comments (2)

[Methods] The exact computation of worst-case cross-cohort performance (e.g., whether it is the minimum over the two test sets or a different aggregation) should be stated explicitly with a formula.
[Figures/Tables] All axes and row/column labels in figures and tables should explicitly indicate the metric (e.g., AUC, accuracy) and the exact cohort split used for each entry.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback emphasizing robustness to cohort effects and the need for statistical rigor in comparative claims. We address each major comment below.

Referee: [Abstract and Results] The central claim that 'the dominant design factor is task-dependent' and the resulting recommendation rest on the assumption that observed performance gaps isolate the effects of extractor/head/segmentation. The manuscript notes label-derivation effects for volume/stage but provides no quantitative sensitivity analysis or ablation for other cohort-level confounds (imaging protocol, reconstruction parameters) between LUNG1 and LUNG2; this directly affects attribution of the task-specific patterns.

Authors: We agree that attributing performance patterns specifically to design choices requires careful consideration of cohort confounds. The manuscript already flags label-derivation effects for volume and stage. The worst-case cross-cohort metric was chosen precisely to reduce sensitivity to cohort-specific artifacts. However, a dedicated quantitative sensitivity analysis for imaging protocol or reconstruction parameters is not present. In revision we will add an explicit limitations paragraph in the Discussion that discusses these potential confounds, notes the absence of detailed protocol metadata in the public LUNG1/LUNG2 releases, and qualifies the strength of the task-dependence claim accordingly. We maintain that the observed patterns are still informative under the two-cohort evaluation design.
Revision: partial

Referee: [Results] In the performance tables, the paper reports mean ranks and 'best' configurations without mentioning statistical significance testing (paired tests, confidence intervals, or multiple-comparison correction) on the cross-cohort differences. Given that the recommendation and 'dominant factor' statements rely on these rankings, absence of such tests weakens the strength of the comparative claims.

Authors: We accept that the absence of statistical testing weakens the comparative statements. In the revised manuscript we will augment the performance tables with paired Wilcoxon signed-rank tests (or equivalent non-parametric tests) on the cross-cohort differences, apply multiple-comparison correction, and report p-values and confidence intervals where appropriate. This will allow readers to evaluate the reliability of the mean-rank differences that underpin the task-specific dominance claims and the recommended default configuration.
Revision: yes
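The rebuttal does not say which multiple-comparison correction would be applied. As one common possibility, the sketch below implements the Holm-Bonferroni step-down adjustment in plain Python. The choice of Holm is an assumption, and the raw p-values are hypothetical; in practice they might come from a paired test such as `scipy.stats.wilcoxon` on per-configuration score differences.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons (not from the paper).
raw = [0.010, 0.040, 0.030, 0.200]
adjusted = holm_adjust(raw)
print([round(p, 3) for p in adjusted])  # [0.04, 0.09, 0.09, 0.2]
```

Holm's procedure controls the family-wise error rate like plain Bonferroni but is uniformly less conservative, which matters when many factor levels are compared on cohorts of a few hundred patients.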

Circularity Check

0 steps flagged

No circularity: purely empirical benchmark with direct cross-cohort measurements

The paper reports an empirical benchmark of feature extractors, heads, and segmentation choices on LUNG1 training and held-out LUNG2 evaluation cohorts. All claims (task-dependent dominance, recommendations for Curia+CatBoost) rest on tabulated performance metrics and mean ranks computed from those measurements. No equations, fitted parameters, uniqueness theorems, or self-citation chains appear in the derivation of results. The design is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard machine-learning assumptions about data distributions and task definitions without introducing new free parameters or postulated entities.

axioms (2)

domain assumption: The LUNG1 and LUNG2 cohorts are sufficiently representative for worst-case cross-cohort performance to indicate real-world robustness. Primary evaluation metric and generalizability claims rest on this premise.
domain assumption: The five tasks are sufficiently distinct to reveal independent effects of segmentation versus classifier choice. The task-dependent dominance conclusion depends on this separation.

