pith. machine review for the scientific record.

arxiv: 2604.23481 · v1 · submitted 2026-04-26 · 💻 cs.CV · cs.LG

Recognition: unknown

Leveraging Spatial Transcriptomics as Alternative to Manual Annotations for Deep Learning-Based Nuclei Analysis

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 06:54 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords spatial transcriptomics · nuclei segmentation · deep learning · pathology images · cell classification · transferability · histopathology · gene expression

The pith

Spatial transcriptomics data supplies training labels that let nuclei segmentation models generalize better to unseen organs than manual annotations do.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Pathology images require precise nuclei segmentation and classification, yet creating manual pixel-level labels across varied tissues and stains is expensive and slow. This work uses spatial transcriptomics measurements, which record gene activity at cell locations within tissue sections, to generate both nuclear masks and cell-type labels directly from the same images. Gene expression profiles are turned into image-friendly cell labels through a bridging classification step designed for deep learning. When tested on organs never seen during training, the resulting models produce higher segmentation accuracy than conventional supervised models trained on manual labels from more organ types.
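The bridging step from gene expression to image-friendly labels can be pictured as marker-gene scoring. A minimal sketch, assuming a simple mean-expression score; the marker panels and gene names below are illustrative, not the paper's actual method or panel:

```python
# Hedged sketch: assign a cell-type label to one cell's gene expression
# profile by scoring illustrative marker-gene sets (not the paper's panel).

MARKERS = {
    "epithelial": ["EPCAM", "KRT8", "KRT18"],
    "immune": ["PTPRC", "CD3E", "CD68"],
    "stromal": ["COL1A1", "VIM", "PDGFRB"],
}

def cell_type_label(expression: dict) -> str:
    """Pick the cell type whose marker genes have the highest mean expression."""
    scores = {
        ctype: sum(expression.get(g, 0.0) for g in genes) / len(genes)
        for ctype, genes in MARKERS.items()
    }
    return max(scores, key=scores.get)

cell = {"EPCAM": 5.2, "KRT8": 3.1, "PTPRC": 0.4, "COL1A1": 0.2}
print(cell_type_label(cell))  # → epithelial
```

The paper's "image-oriented" bridge presumably goes beyond this, adapting the typing so that the resulting labels are learnable from visual features alone.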

Core claim

The paper claims that cell-level spatial transcriptomics data aligned to histopathological images supplies both nuclear masks and gene expression profiles. These profiles are converted into cell-type labels using an image-oriented classification approach that links gene-based typing to visual features. Models trained on the resulting labels achieve higher segmentation accuracy on previously unseen organs than models trained with manual annotations from a larger set of organs, while also improving cell classification performance.

What carries the argument

The alignment of spatial transcriptomics data to histopathological images to derive nuclear masks and gene-expression cell labels, connected by an image-oriented cell-type classification method that adapts gene profiles for visual recognition.

If this is right

  • Nuclei analysis models can be developed for new tissues without repeating large-scale manual annotation efforts.
  • Segmentation performance improves on organs outside the training distribution, indicating stronger cross-tissue transfer.
  • Cell-type classification sees consistent accuracy gains from image-oriented adaptation of gene expression data.
  • Training sets become easier to expand across different staining protocols and tissue conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If alignment quality holds at scale, the approach could support large multi-organ datasets that reduce reliance on expert annotators in clinical settings.
  • The same ST-derived labels might be combined with other spatial omics layers to refine cell boundaries in densely packed tissues.
  • Models built this way could serve as starting points for fine-tuning on small manual sets, accelerating deployment in new pathology labs.

Load-bearing premise

Spatial transcriptomics measurements can be aligned accurately enough with histopathological images to produce reliable nuclear masks and gene expression profiles that convert into effective cell-type labels for image-based training.
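The simplest version of this alignment premise is assigning each ST measurement to the nearest nucleus centroid. A toy sketch under assumed pixel coordinates; the distance cutoff is an invented parameter, not one reported by the paper:

```python
import math

def assign_spots_to_nuclei(spots, nuclei, max_dist=10.0):
    """Assign each ST spot to its nearest nucleus centroid, or None if none
    lies within max_dist (an illustrative threshold, in pixels)."""
    assignments = {}
    for sid, (sx, sy) in spots.items():
        best, best_d = None, max_dist
        for nid, (nx, ny) in nuclei.items():
            d = math.hypot(sx - nx, sy - ny)
            if d < best_d:
                best, best_d = nid, d
        assignments[sid] = best
    return assignments

spots = {"s1": (2.0, 2.0), "s2": (50.0, 50.0)}
nuclei = {"n1": (0.0, 0.0), "n2": (30.0, 30.0)}
print(assign_spots_to_nuclei(spots, nuclei))  # → {'s1': 'n1', 's2': None}
```

Any registration error shifts these assignments directly, which is why the referee below treats alignment fidelity as load-bearing.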

What would settle it

Train a nuclei segmentation model on ST-derived labels from a limited set of organs, test it on a completely new organ, and find that its Dice score or equivalent accuracy metric falls below that of a model trained on manual annotations from multiple organs.
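The Dice comparison that experiment hinges on is simple to compute. A toy sketch over foreground pixel sets (the masks here are fabricated for illustration):

```python
def dice(pred: set, truth: set) -> float:
    """Dice coefficient between predicted and ground-truth foreground pixel sets."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

truth = {(r, c) for r in range(4) for c in range(4)}  # 16-pixel nucleus
pred = {(r, c) for r in range(4) for c in range(3)}   # 12-pixel prediction
print(dice(pred, truth))  # → 0.8571428571428571
```

The settling experiment would compare mean per-image Dice of the ST-supervised model against the manually supervised baseline on the held-out organ.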

Figures

Figures reproduced from arXiv: 2604.23481 by Haruka Hirose, Kazuya Nishimura, Ryoma Bise, Yasuhiro Kojima.

Figure 1
Figure 1. Overview of the proposed framework.
Figure 2
Figure 2. Example of generated masks. While the PanNuke baseline was fully supervised across 19 organs, the proposed HEST1K model was trained on 9 organs; despite the smaller number of slides and organs, HEST1K leverages substantially more nuclear instances per slide via DAPI-derived masks, providing dense supervision.
Figure 3
Figure 3. Example of segmentation results, including an ablation variant without hierarchical classification of neoplastic cells (w/o Hier) and a comparison with conventional supervised approaches on the PanNuke dataset under the standard segmentation setup.
read the original abstract

Deep learning-based nuclei segmentation and classification in pathology images typically rely on large-scale pixel-level manual annotations, which are costly and difficult to obtain across diverse tissues and staining conditions. To address this limitation, we propose a framework that leverages spatial transcriptomics (ST) data as supervision for nuclei segmentation and classification. By incorporating cell-level ST data, we obtain gene expression profiles and corresponding nuclear masks from histopathological images. Gene expression profiles are converted into cell-type labels and used as training data for image-based classification. Because existing gene expression-based cell-type classification methods are not designed for image recognition, we introduce an image-oriented classification approach that bridges gene expression-based cell typing and image-based cell classification. To evaluate generalization, we conduct segmentation experiments on previously unseen organs and compare our method with conventional supervised models. Despite being trained on fewer organ types, our framework achieves higher segmentation accuracy, demonstrating strong transferability. Classification experiments further show consistent improvements over existing approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a framework that uses spatial transcriptomics (ST) data to generate nuclear masks and cell-type labels as supervision for deep learning models performing nuclei segmentation and classification in histopathological images, thereby avoiding costly manual pixel-level annotations. Gene expression profiles from ST are converted to cell-type labels via a new image-oriented classification bridge. Experiments evaluate generalization by training on fewer organ types and testing segmentation on previously unseen organs, claiming higher accuracy and better transferability than conventional supervised models trained with manual annotations.

Significance. If the ST-derived supervision proves reliable, the approach could meaningfully reduce dependence on manual annotations while improving cross-tissue generalization in computational pathology. The use of independent ST measurements (rather than model-derived pseudo-labels) avoids circularity and is a clear strength. The transferability result, if substantiated with metrics, would be a useful contribution to weakly-supervised nuclei analysis.

major comments (2)
  1. [Methods] Methods (ST alignment and label generation): The paper describes deriving nuclear masks and gene-expression-to-cell-type labels from ST data but reports no quantitative validation of alignment fidelity or label quality (e.g., Dice/IoU of derived masks vs. manual nuclei, spot-to-nucleus assignment precision, or label noise rate vs. pathologist review). This is load-bearing for the central claim because ST spots (~50-100 µm) are much larger than nuclei (~5-10 µm); any registration error directly corrupts the pixel-level supervision and could produce the reported accuracy gains through dataset artifacts rather than the proposed method.
  2. [Results] Results (segmentation experiments): The claim that the framework 'achieves higher segmentation accuracy' on unseen organs despite training on fewer organ types is presented without numerical metrics, baseline details, sample sizes, or statistical tests. This prevents assessment of effect size, post-hoc selection, or whether the gain is statistically meaningful, directly undermining evaluation of the transferability result.
minor comments (1)
  1. [Abstract] Abstract: The summary of results would be strengthened by including at least one key quantitative metric (e.g., Dice score improvement) to support the accuracy and transferability claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and positive assessment of the potential significance of our work. We address each major comment below and have revised the manuscript accordingly to strengthen the presentation and support for our claims.

read point-by-point responses
  1. Referee: [Methods] Methods (ST alignment and label generation): The paper describes deriving nuclear masks and gene-expression-to-cell-type labels from ST data but reports no quantitative validation of alignment fidelity or label quality (e.g., Dice/IoU of derived masks vs. manual nuclei, spot-to-nucleus assignment precision, or label noise rate vs. pathologist review). This is load-bearing for the central claim because ST spots (~50-100 µm) are much larger than nuclei (~5-10 µm); any registration error directly corrupts the pixel-level supervision and could produce the reported accuracy gains through dataset artifacts rather than the proposed method.

    Authors: We agree that quantitative validation of alignment fidelity and label quality is essential to substantiate the central claims, given the mismatch in resolution between ST spots and nuclei. The original manuscript provides qualitative visualizations of the derived masks and labels along with indirect support via downstream task performance. In the revised manuscript, we have added a new subsection under Methods that includes quantitative metrics: Dice and IoU scores comparing ST-derived nuclear masks to available manual annotations on a subset of images, precision/recall for spot-to-nucleus assignments, and agreement rates between ST-derived cell-type labels and pathologist review. We also include a sensitivity analysis discussing the effects of potential registration errors and how the image-oriented classification bridge helps mitigate label noise. revision: yes

  2. Referee: [Results] Results (segmentation experiments): The claim that the framework 'achieves higher segmentation accuracy' on unseen organs despite training on fewer organ types is presented without numerical metrics, baseline details, sample sizes, or statistical tests. This prevents assessment of effect size, post-hoc selection, or whether the gain is statistically meaningful, directly undermining evaluation of the transferability result.

    Authors: We acknowledge that the results section would benefit from more explicit and complete reporting to allow full evaluation of the transferability claims. The manuscript already contains the underlying experimental results (segmentation metrics on seen and unseen organs, comparisons to conventional supervised baselines), but the presentation was insufficiently detailed. In the revision, we have expanded the Results section and associated tables to explicitly report all numerical metrics (e.g., Dice, IoU, and classification accuracy), baseline model architectures and training protocols, exact sample sizes (number of organs, images, and nuclei), and statistical tests with p-values to quantify significance and effect sizes. These additions directly address concerns about post-hoc selection and statistical meaningfulness. revision: yes
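The statistical tests promised here could be as lightweight as a sign-flip permutation test on per-image Dice differences (proposed minus baseline). A hedged sketch with fabricated numbers, not results from the paper:

```python
import random

def paired_permutation_test(diffs, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-image differences.
    Returns the fraction of random sign assignments whose mean difference
    is at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = sum(d * rng.choice((-1, 1)) for d in diffs) / len(diffs)
        if abs(flipped) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical per-image Dice differences on an unseen organ.
diffs = [0.03, 0.05, 0.01, 0.04, 0.02, 0.06, 0.03, 0.05]
print(paired_permutation_test(diffs))  # small p: consistent positive shift
```

With all eight fabricated differences positive, the p-value lands well below 0.05, the kind of evidence the referee's second comment asks for.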

Circularity Check

0 steps flagged

No significant circularity; supervision and claims are empirically grounded in independent ST data

full rationale

The paper derives training labels and masks directly from spatial transcriptomics measurements (gene expression profiles aligned to images, converted to cell-type labels) rather than from model outputs or fitted parameters. The central claims—higher segmentation accuracy on unseen organs despite fewer training organ types, plus classification improvements—are presented as empirical results from experiments comparing against conventional supervised models. No equations, self-definitional loops, fitted-input-as-prediction steps, or load-bearing self-citations appear in the provided text; the image-oriented classification bridge is introduced as a methodological choice to adapt ST-derived labels, not as a tautology. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about data alignment and label conversion; no free parameters or invented entities are specified in the abstract.

axioms (2)
  • domain assumption ST spots can be aligned to produce accurate nuclear masks and gene expression profiles from histopathological images
    Invoked to obtain training data for segmentation and classification.
  • domain assumption Gene expression profiles can be converted into cell-type labels suitable for training image-based classifiers
    Central to the introduced image-oriented classification approach.

pith-pipeline@v0.9.0 · 5472 in / 1340 out tokens · 51793 ms · 2026-05-08T06:54:03.434782+00:00 · methodology


Reference graph

Works this paper leans on

28 extracted references · 1 canonical work page

  1. [1] Aibar, S., González-Blas, C.B., Moerman, T., Huynh-Thu, V.A., Imrichova, H., Hulselmans, G., Rambow, F., Marine, J.C., Geurts, P., Aerts, J., et al.: SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14(11), 1083–1086 (2017)
  2. [2] Andreatta, M., Carmona, S.J.: UCell: robust and scalable single-cell gene signature scoring. Comput. Struct. Biotechnol. J. 19, 3796–3798 (2021)
  3. [3] Cable, D.M., Murray, E., Zou, L.S., Goeva, A., Macosko, E.Z., Chen, F., Irizarry, R.A.: Robust decomposition of cell type mixtures in spatial transcriptomics. Nat. Biotechnol. 40(4), 517–526 (2022)
  4. [4] Chen, H., Li, D., Bar-Joseph, Z.: SCS: cell segmentation for high-resolution spatial transcriptomics. Nat. Methods 20(8), 1237–1243 (2023)
  5. [5] Cheng, J., Rajapakse, J.C., et al.: Segmentation of clustered nuclei with shape markers and marking function. IEEE Trans. Biomed. Eng. 56(3), 741–748 (2008)
  6. [6] Cheng, J., Jin, X., Smyth, G.K., Chen, Y.: Benchmarking cell type annotation methods for 10x Xenium spatial transcriptomics data. BMC Bioinformatics 26(1), 22 (2025)
  7. [7] Defard, T., Blondel, A., Bellow, S., Coleon, A., de Melo, G.D., Walter, T., Mueller, F.: RNA2seg: a generalist model for cell segmentation in image-based spatial transcriptomics. bioRxiv pp. 2025–03 (2025)
  8. [8] Elosua-Bayes, M., Nieto, P., Mereu, E., Gut, I., Heyn, H.: SPOTlight: seeded NMF regression to deconvolute spatial transcriptomics spots with single-cell transcriptomes. Nucleic Acids Res. 49(9), e50–e50 (2021)
  9. [9] Franzén, O., Gan, L.M., Björkegren, J.L.: PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019, baz046 (2019)
  10. [10] Gamper, J., Alemi Koohbanani, N., Benet, K., Khuram, A., Rajpoot, N.: PanNuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification. In: Eur. Congr. Digit. Pathol. pp. 11–19. Springer (2019)
  11. [11] Gamper, J., Koohbanani, N.A., Benes, K., Graham, S., Jahanifar, M., Khurram, S.A., Azam, A., Hewitt, K., Rajpoot, N.: PanNuke dataset extension, insights and baselines. arXiv preprint arXiv:2003.10778 (2020)
  12. [12] Graham, S., Jahanifar, M., Azam, A., Nimir, M., Tsang, Y.W., Dodd, K., Hero, E., Sahota, H., Tank, A., Benes, K., et al.: Lizard: a large-scale dataset for colonic nuclear instance segmentation and classification. In: ICCV. pp. 684–693 (2021)
  13. [13] Graham, S., Vu, Q.D., Raza, S.E.A., Azam, A., Tsang, Y.W., Kwak, J.T., Rajpoot, N.: HoVer-Net: simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 58, 101563 (2019)
  14. [14] Hörst, F., Rempe, M., Heine, L., Seibold, C., Keyl, J., Baldini, G., Ugurel, S., Siveke, J., Grünwald, B., Egger, J., et al.: CellViT: vision transformers for precise cell segmentation and classification. Med. Image Anal. 94, 103143 (2024)
  15. [15] Hu, C., Li, T., Xu, Y., Zhang, X., Li, F., Bai, J., Chen, J., Jiang, W., Yang, K., Ou, Q., et al.: CellMarker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 51(D1), D870–D876 (2023)
  16. [16] Janesick, A., Kravitz, S.N., Stauffer, W., Valencia, M., Taylor, S.E.: Biomarker quantification in breast cancer using Xenium in situ. bioRxiv pp. 2025–12 (2025)
  17. [17] Janesick, A., Shelansky, R., Gottscho, A.D., Wagner, F., Williams, S.R., Rouault, M., Beliakoff, G., Morrison, C.A., Oliveira, M.F., Sicherman, J.T., et al.: High resolution mapping of the tumor microenvironment using integrated single-cell, spatial and in situ analysis. Nat. Commun. 14(1), 8353 (2023)
  18. [18] Jaume, G., Doucet, P., Song, A., Lu, M.Y., Almagro Pérez, C., Wagner, S., Vaidya, A., Chen, R., Williamson, D., Kim, A., et al.: HEST-1k: a dataset for spatial transcriptomics and histology image analysis. NeurIPS 37, 53798–53833 (2024)
  19. [19] Kleshchevnikov, V., Shmatko, A., Dann, E., Aivazidis, A., King, H.W., Li, T., Elmentaite, R., Lomakin, A., Kedlian, V., Gayoso, A., et al.: Cell2location maps fine-grained cell types in spatial transcriptomics. Nat. Biotechnol. 40(5), 661–671 (2022)
  20. [20] Kumar, N., Verma, R., Anand, D., Zhou, Y., Onder, O.F., Tsougenis, E., Chen, H., Heng, P.A., Li, J., Hu, Z., et al.: A multi-organ nucleus segmentation challenge. IEEE Trans. Med. Imaging 39(5), 1380–1391 (2019)
  21. [21] Petukhov, V., Xu, R.J., Soldatov, R.A., Cadinu, P., Khodosevich, K., Moffitt, J.R., Kharchenko, P.V.: Cell segmentation in imaging-based spatial transcriptomics. Nat. Biotechnol. 40(3), 345–354 (2022)
  22. [22] Schuiveling, M., Liu, H., Eek, D., Breimer, G.E., Suijkerbuijk, K.P., Blokx, W.A., Veta, M.: A novel dataset for nuclei and tissue segmentation in melanoma with baseline nuclei segmentation and tissue segmentation benchmarks. GigaScience 14, giaf011 (2025)
  23. [23] Tirosh, I., Izar, B., Prakadan, S.M., Wadsworth, M.H., Treacy, D., Trombetta, J.J., Rotem, A., Rodman, C., Lian, C., Murphy, G., et al.: Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352(6282), 189–196 (2016)
  24. [24] Traag, V.A., Waltman, L., Van Eck, N.J.: From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports 9(1), 5233 (2019)
  25. [25] Veta, M., Van Diest, P.J., Kornegoor, R., Huisman, A., Viergever, M.A., Pluim, J.P.: Automatic nuclei segmentation in H&E stained breast cancer histopathology images. PLoS ONE 8(7), e70221 (2013)
  26. [26] Wang, Y., Wang, W., Liu, D., Hou, W., Zhou, T., Ji, Z.: GeneSegNet: a deep learning framework for cell segmentation by integrating gene expression and imaging. Genome Biol. 24(1), 235 (2023)
  27. [27] Yuan, H., Yan, M., Zhang, G., Liu, W., Deng, C., Liao, G., Xu, L., Luo, T., Yan, H., Long, Z., et al.: CancerSEA: a cancer single-cell state atlas. Nucleic Acids Res. 47(D1), D900–D908 (2019)
  28. [28] Zhang, A.W., O'Flanagan, C., Chavez, E.A., Lim, J.L., Ceglia, N., McPherson, A., Wiens, M., Walters, P., Chan, T., Hewitson, B., et al.: Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat. Methods 16(10), 1007–1015 (2019)