CSV-ViT: A Vision Transformer with the Variable-sized Cortical Supervertices for Detection of Alzheimer's Disease Pathologies

Geonwoo Baek; Ikbeom Jang

arXiv: 2605.26514 · v1 · pith:LXT524HWnew · submitted 2026-05-26 · 💻 cs.CV · cs.AI · cs.LG

Pith reviewed 2026-06-29 18:00 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG

keywords Alzheimer's disease · cortical surface · vision transformer · supervertices · MRI · amyloid positivity · tau positivity · surface-based models

The pith

A Vision Transformer using variable-sized cortical supervertices outperforms prior surface models on MRI classification of Alzheimer's disease, amyloid positivity, and tau positivity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CSV-ViT, which tokenizes cortical surfaces from T1-weighted MRI into variable-sized patches called cortical supervertices that preserve regions of interest and exclude non-cortical areas. It adapts the Vision Transformer architecture with padding and mask-aware patch embedding to handle these variable patches on spherical data. The approach is tested on three classification tasks: Alzheimer's disease diagnosis, amyloid positivity, and tau positivity. A sympathetic reader would care because the method uses widely available structural MRI for prescreening instead of costly and invasive PET scans. Experiments show the model reaches higher classification performance than recent surface-based alternatives.

Core claim

The paper claims that its cortical surface tokenization, which performs ROI-preserving vertex-based variable-sized patch partitioning into cortical supervertices, combined with a padding-tolerant and mask-aware Vision Transformer, produces higher classification performance than recent surface-based models when applied to T1-weighted MRI for AD diagnosis, amyloid positivity, and tau positivity.

What carries the argument

Cortical supervertices (CSVs), defined as ROI-preserving, vertex-based, variable-sized patches obtained by partitioning the cortical surface, together with padding and mask-aware patch embedding that allows the Vision Transformer to process variable patch sizes without boundary duplication or information loss.
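
As a rough illustration of that mechanism, the NumPy sketch below pads variable-sized patches to a common length and embeds each one by projecting every vertex and averaging over real vertices only. The paper's actual embedding is not visible from the abstract, so the pooling choice, the function names `pad_patches` and `masked_embed`, and all shapes are assumptions made for illustration.

```python
import numpy as np

def pad_patches(patches, pad_value=0.0):
    """Stack variable-sized patches (each [n_i, C]) into one padded array.

    Returns (padded [P, N_max, C], mask [P, N_max]); mask is True on real vertices.
    """
    n_max = max(p.shape[0] for p in patches)
    channels = patches[0].shape[1]
    padded = np.full((len(patches), n_max, channels), pad_value, dtype=np.float64)
    mask = np.zeros((len(patches), n_max), dtype=bool)
    for i, p in enumerate(patches):
        padded[i, : p.shape[0]] = p
        mask[i, : p.shape[0]] = True
    return padded, mask

def masked_embed(padded, mask, weight, bias):
    """Project every vertex, zero the padded slots, then average real vertices only."""
    projected = padded @ weight + bias           # [P, N_max, D]
    projected = projected * mask[..., None]      # padded slots contribute nothing
    counts = mask.sum(axis=1, keepdims=True)     # [P, 1] real vertices per patch
    return projected.sum(axis=1) / np.maximum(counts, 1)  # [P, D]

# Hypothetical toy input: three patches with 5, 9, and 7 vertices, 4 features each.
rng = np.random.default_rng(0)
patches = [rng.normal(size=(n, 4)) for n in (5, 9, 7)]
weight = rng.normal(size=(4, 8))
bias = rng.normal(size=8)

padded_a, mask = pad_patches(patches, pad_value=0.0)
padded_b, _ = pad_patches(patches, pad_value=123.0)
tokens_a = masked_embed(padded_a, mask, weight, bias)
tokens_b = masked_embed(padded_b, mask, weight, bias)
```

Because padded slots are zeroed before pooling, `tokens_a` and `tokens_b` agree even though the padding values differ, which is the property a mask-aware embedding is meant to guarantee.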

If this is right

The model supports MRI-based prediction of AD-related status prior to PET or CSF confirmation.
Variable-sized partitioning avoids inclusion of non-cortical regions and duplicate vertices at patch boundaries.
The framework achieves higher classification performance across AD diagnosis, amyloid positivity, and tau positivity tasks.
The tokenization enables learning directly from the spherical topology of brain cortical surfaces.
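
The second consequence above can be made concrete with a toy partition. The sketch below grows patches from seed vertices by multi-source breadth-first search on a small grid graph, skipping a masked "medial wall" block. This is a stand-in technique chosen for brevity, not the paper's partitioning algorithm; the grid, the seed positions, and the names `grid_adjacency` and `partition_vertices` are all hypothetical.

```python
from collections import deque

def grid_adjacency(rows, cols):
    """4-neighbour adjacency on a rows x cols grid, a toy stand-in for a cortical mesh."""
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            neighbours = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbours.append(rr * cols + cc)
            adjacency[r * cols + c] = neighbours
    return adjacency

def partition_vertices(adjacency, cortex, seeds):
    """Assign each cortical vertex to the patch of the seed that reaches it first."""
    label = {s: i for i, s in enumerate(seeds)}
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        for u in adjacency[v]:
            # Vertices outside the ROI are never labelled, so they join no patch.
            if u in cortex and u not in label:
                label[u] = label[v]
                queue.append(u)
    patches = [[] for _ in seeds]
    for v, k in sorted(label.items()):
        patches[k].append(v)
    return patches

adjacency = grid_adjacency(6, 6)
medial_wall = {14, 15, 20, 21}           # central 2x2 block, excluded from the ROI
cortex = set(adjacency) - medial_wall
patches = partition_vertices(adjacency, cortex, seeds=[0, 5, 32])
sizes = [len(p) for p in patches]
```

Each cortical vertex ends up in exactly one patch, no masked vertex is included, and the patch sizes differ because they follow the graph rather than a fixed template.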

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same supervertex tokenization could be tested on other cortical-surface classification problems such as additional neurodegenerative conditions.
Combining CSV-ViT predictions with PET data in a multi-modal setting might further improve specificity for amyloid and tau status.
Systematic variation of supervertex size distributions could reveal an optimal granularity for different diagnostic targets.

Load-bearing premise

The variable-sized cortical supervertex partitioning preserves the region of interest without including non-cortical regions such as the medial wall, and the padding plus mask-aware patch embedding does not degrade information or introduce artifacts that affect classification.

What would settle it

A direct replication on the same T1-weighted MRI datasets and tasks in which CSV-ViT fails to exceed the classification accuracy of the compared recent surface-based models on any of the three targets.
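
One way such a head-to-head replication could be scored is with a paired test on the same test subjects. The sketch below uses an exact McNemar test on the discordant predictions of two classifiers. The paper's own statistical procedure is not visible here, so the choice of test is an assumption, and the labels and predictions are made-up toy values rather than results.

```python
from math import comb

def discordant_counts(y_true, pred_a, pred_b):
    """Count cases where exactly one of the two models is correct."""
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))  # A right, B wrong
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))  # A wrong, B right
    return b, c

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from a Binomial(b + c, 0.5) null."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Toy labels and predictions (hypothetical, for illustration only).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
pred_b = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1]

b, c = discordant_counts(y_true, pred_a, pred_b)
p_value = mcnemar_exact(b, c)
```

Only the discordant cases enter the test, so two models with similar accuracy on mostly the same subjects are not credited with a difference they have not shown.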

Figures

Figures reproduced from arXiv: 2605.26514 by Geonwoo Baek, Ikbeom Jang.

**Figure 1.** Overall framework, from cortical surface partitioning to classification using CSV-ViT.

**Figure 2.** Differences in patches across ViT, SiT, and CSV-ViT.

Original abstract

Confirming Alzheimer's disease (AD) typically relies on positron emission tomography (PET), which remains costly and invasive, motivating the use of structural MRI-based prescreening. Deep learning on non-Euclidean manifolds, particularly brain cortical surfaces, faces significant challenges due to the data's spherical topology. Recent surface models have enabled learning from cortical surface data; however, imposing face-based uniform patches often causes duplicate vertices at patch boundaries. In general, many surface-based models are limited in their awareness of the region of interest (ROI), which can result in non-cortical regions, such as the medial wall, being included. We propose a cortical surface tokenization that performs ROI-preserving, vertex-based, variable-sized patch partitioning. We refer to these cortical surface patches as cortical supervertices (CSVs). Building on this representation, we design the CSV Vision Transformer (CSV-ViT), a variable-size patch-tolerant Vision Transformer that uses padding and a mask-aware patch embedding. We used T1-weighted MRI and evaluated our framework by classifying AD-related status into three categories: AD diagnosis, amyloid positivity, and tau positivity. Across the experiments, CSV-ViT achieved higher classification performance than recent surface-based models. The results suggest that the proposed CSV-ViT may support MRI-based prediction of AD-related status prior to PET or CSF confirmation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

CSV-ViT's main move is a vertex-based variable-sized supervertex tokenization that keeps patches inside the cortical ROI, paired with padding and mask-aware embedding so a standard ViT can process them.

The punchline is that the paper gives a concrete fix for two recurring headaches in surface-based models: uniform face patches that duplicate vertices at boundaries and methods that leak in non-cortical regions like the medial wall. The CSV partitioning is vertex-driven and allows different patch sizes while staying ROI-aware, and CSV-ViT adds the minimal changes needed to let the transformer ignore padded areas.
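
The boundary-duplication issue is easy to see on a tiny mesh. In the sketch below, a strip of four triangles is split into two face-based patches, and the vertices on the shared edge land in both; a vertex-based split of the same mesh has no such overlap. The mesh and both groupings are hypothetical toy inputs, not taken from the paper.

```python
from collections import Counter

# A strip of four triangles over six vertices.
faces = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]

face_patches = [[0, 1], [2, 3]]            # patches defined as groups of face indices
vertex_patches = [[0, 1, 2], [3, 4, 5]]    # patches defined as groups of vertices

def vertices_per_face_patch(faces, face_patches):
    """Collect the vertices touched by the faces of each patch."""
    return [sorted({v for f in patch for v in faces[f]}) for patch in face_patches]

def duplicated(patches):
    """Return vertices that appear in more than one patch."""
    counts = Counter(v for patch in patches for v in patch)
    return sorted(v for v, n in counts.items() if n > 1)

face_vertex_sets = vertices_per_face_patch(faces, face_patches)
dup_face_based = duplicated(face_vertex_sets)      # [2, 3]
dup_vertex_based = duplicated(vertex_patches)      # []
```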

What the work does cleanly is spell out those prior limitations and then describe a tokenization plus embedding scheme that directly targets them. The three-task evaluation on T1 MRI for AD diagnosis, amyloid positivity, and tau positivity is a sensible test bed for the prescreening use case.

The soft spots are on the empirical side. The abstract states higher performance than recent surface models, but the magnitude, variance, and controls are not visible here, so it is hard to tell how much the new tokenization actually moves the needle versus other design choices. The central assumption that the variable patches and masking preserve signal without new artifacts needs the full results and ablations to hold up; the stress-test found no internal contradictions in the method description, which is reassuring but still leaves the quantitative validation as the load-bearing part.

This paper is for people already working on cortical surface representations or medical ViTs who need a practical way to avoid ROI leakage. A reader who follows surface-based AD work would get the most out of the specific adaptations.

Send it to peer review. The technical adaptation is narrow but well-motivated, and the claim is testable once the numbers and controls are examined.

Referee Report

0 major / 3 minor

Summary. The paper proposes CSV-ViT, a Vision Transformer for cortical surface data from T1-weighted MRI that tokenizes the surface using ROI-preserving, vertex-based, variable-sized cortical supervertices (CSVs). It introduces padding and mask-aware patch embedding to handle variable patch sizes in the ViT, and evaluates the model on three binary classification tasks: AD diagnosis, amyloid positivity, and tau positivity. The central claim is that CSV-ViT outperforms recent surface-based models on these tasks.

Significance. If the performance gains hold under full validation, the work offers a practical advance in non-invasive MRI-based prescreening for AD-related biomarkers, addressing known difficulties with uniform face-based patching and medial-wall inclusion on spherical cortical surfaces. The explicit handling of variable-sized, ROI-preserving supervertices and mask-aware embedding is a targeted technical contribution to manifold learning on brain surfaces.

minor comments (3)

[Methods] The abstract and method sections would benefit from explicit dataset sizes, train/validation/test splits, and subject-level cross-validation details to allow direct comparison with prior surface-based baselines.
[Experiments] Figure captions and the experimental results section should report exact AUC, accuracy, and F1 values with confidence intervals or standard deviations across runs rather than qualitative statements of 'higher performance'.
[Section 3.1] The description of the supervertex partitioning algorithm would be clearer with a short pseudocode block or explicit reference to the number of supervertices per hemisphere and the stopping criterion used.
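
The second comment asks for metrics with uncertainty attached. A minimal sketch of one common way to do that, a percentile bootstrap interval around the AUC, is shown below. It assumes one scan per subject (with repeated scans, resampling would need to happen at the subject level), and the scores are synthetic draws, not results from the paper; `auc` and `bootstrap_auc_ci` are hypothetical helper names.

```python
import numpy as np

def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count half)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC, resampling cases with replacement."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(labels), size=len(labels))
        if labels[idx].min() == labels[idx].max():
            continue  # resample holds a single class, so the AUC is undefined
        stats.append(auc(labels[idx], scores[idx]))
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)

# Synthetic scores: positives are shifted upward by one standard deviation.
rng = np.random.default_rng(1)
y = np.array([0] * 40 + [1] * 40)
s = rng.normal(loc=y * 1.0, scale=1.0)

point = auc(y, s)
low, high = bootstrap_auc_ci(y, s)
```

Reporting `point` together with `(low, high)` for each task and each baseline would let readers judge whether a gap between models exceeds the resampling noise.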

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. We appreciate the recognition of the technical contributions regarding ROI-preserving variable-sized cortical supervertices and the mask-aware ViT embedding.

Circularity Check

0 steps flagged

No significant circularity

The manuscript presents an empirical architecture proposal (CSV tokenization plus mask-aware ViT) whose central claim is comparative classification accuracy on three AD-related tasks from T1 MRI. No equations, derivations, or parameter-fitting steps appear that could reduce a claimed prediction to a self-defined input. The method description relies on standard ViT components with added padding/masking; performance is reported via direct experiment rather than any self-referential construction. No load-bearing self-citations or uniqueness theorems are invoked. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; implementation and evaluation details are absent.

pith-pipeline@v0.9.1-grok · 5780 in / 962 out tokens · 25673 ms · 2026-06-29T18:00:04.372014+00:00 · methodology
