arxiv: 2604.24347 · v1 · submitted 2026-04-27 · 📡 eess.IV · cs.CV· cs.LG

Recognition: unknown

Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions

Alexander Effland, Marieta Toma, Michael H\"olzel, Thomas Pinetz, Yangping Li

Pith reviewed 2026-05-07 17:13 UTC · model grok-4.3

classification 📡 eess.IV cs.CVcs.LG

keywords semantic segmentationhistopathologyweakly supervisedvariational optimizationlabel proportionsWasserstein distancelearned regularization

0 comments

The pith

VSLP produces dense histopathology segmentations from global label proportions using variational optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Variational Segmentation from Label Proportions (VSLP) to solve semantic segmentation in pathology images when only global tissue type proportions are known rather than expensive pixel-wise labels. It uses a two-stage process where a pre-trained transformer first generates initial pixel confidence maps via test-time augmentation, then a variational solver refines them with a Wasserstein fidelity term that enforces the given proportions plus a learned regularizer that imposes spatial structure. The resulting segmentations are more accurate than prior weakly supervised or unsupervised approaches on public datasets and an in-house collection with noisy pathologist estimates, while also allowing visualization of the energy terms for interpretability. This matters because proportions of tissue types are routinely assessed in pathology as disease markers but rarely mapped back to spatial detail.

Core claim

We introduce Variational Segmentation from Label Proportions (VSLP), a two-stage framework that infers dense segmentations from global label proportions without any pixel-level annotations. This framework first leverages a pre-trained transformer model with test-time augmentation to produce a pixel-wise confidence estimate. In the second stage, these estimates are fused by solving a variational optimization problem that incorporates a Wasserstein data fidelity term alongside a learned regularizer. Unlike end-to-end networks, our variational method can visualize the fidelity-regularization energy, resulting in more interpretable segmentation. We validate our approach on two public datasets, a

What carries the argument

The variational optimization stage that fuses initial transformer confidence maps by minimizing a combination of Wasserstein distance to the supplied global proportions and a learned regularizer that encodes spatial priors.

Load-bearing premise

The pre-trained transformer with test-time augmentation produces sufficiently accurate initial pixel-wise confidence estimates so that the learned regularizer can resolve the many possible spatial arrangements consistent with the same global proportions.

What would settle it

On a held-out histopathology dataset that supplies both global proportions and independent pixel-level ground truth, the full VSLP pipeline fails to produce higher Dice scores than the initial transformer confidence map alone or than standard weakly-supervised baselines.

Figures

Figures reproduced from arXiv: 2604.24347 by Alexander Effland, Marieta Toma, Michael H\"olzel, Thomas Pinetz, Yangping Li.

**Figure 1.** Figure 1: Semantic segmentation masks are generated by VSLP for histopathology images using a two-stage approach. Initially, view at source ↗

**Figure 2.** Figure 2: Visualization of the iterative optimization scheme for view at source ↗

**Figure 3.** Figure 3: Comparison of our method with the state-of-the-art approaches on samples from the RINGS [ view at source ↗

**Figure 4.** Figure 4: Comparison of our method with the state-of-the-art approaches on samples from the BCSS [ view at source ↗

**Figure 5.** Figure 5: Comparison of our method with LLP-adapted SA-MIL and C2C on the in-house dataset. Red indicates high branching, view at source ↗

read the original abstract

In pathology, the spatial distribution and proportions of tissue types are key indicators of disease progression, and are more readily available than fine-grained annotations. However, these assessments are rarely mapped to pixel-wise segmentation. The task is fundamentally underdetermined, as many spatially distinct segmentations can satisfy the same global proportions in the absence of pixel-wise constraints. To address this, we introduce Variational Segmentation from Label Proportions (VSLP), a two-stage framework that infers dense segmentations from global label proportions, without any pixel-level annotations. This framework first leverages a pre-trained transformer model with test-time augmentation to produce a pixel-wise confidence estimate. In the second stage, these estimates are fused by solving a variational optimization problem that incorporates a Wasserstein data fidelity term alongside a learned regularizer. Unlike end-to-end networks, our variational method can visualize the fidelity-regularization energy, resulting in more interpretable segmentation. We validate our approach on two public datasets, achieving superior performance over existing weakly supervised and unsupervised methods. For one of these datasets, proportions have been estimated by an experienced pathologist to provide a realistic benchmark to the community. Furthermore, the method scales to an in-house dataset with noisy pathologist labels, severely outperforming state-of-the-art methods, thereby demonstrating practical applicability. The code and data will be made publicly available upon acceptance at https://github.com/xiaoliangpi/VSLP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

VSLP combines a transformer confidence map with variational Wasserstein optimization and a learned regularizer to produce dense segmentations from global proportions only, but the abstract gives no numbers or training details so the practical gain is still unclear.

read the letter

The paper's core move is to treat dense segmentation as an underdetermined inverse problem and solve it in two stages: first a pre-trained transformer with test-time augmentation for pixel-wise confidence, then a variational problem that matches global proportions via Wasserstein distance while applying a learned regularizer. That combination, plus the ability to inspect the energy terms, is what they call VSLP. They also release a dataset where proportions were scored by a pathologist, which is a useful addition for the community, and they plan to open the code.

Referee Report

3 major / 1 minor

Summary. The manuscript introduces Variational Segmentation from Label Proportions (VSLP), a two-stage framework for dense semantic segmentation in histopathology images using only global label proportions and no pixel-level annotations. Stage one employs a pre-trained transformer with test-time augmentation to generate initial pixel-wise confidence maps. Stage two fuses these via variational optimization incorporating a Wasserstein data fidelity term and a learned regularizer. The approach is claimed to achieve superior performance over weakly supervised and unsupervised baselines on two public datasets (one with pathologist-estimated proportions) and one in-house dataset with noisy labels, while offering interpretability through energy visualization; code and data release is promised.

Significance. If the quantitative claims hold after verification, the work would be significant for computational pathology by addressing the annotation bottleneck through global proportions, which are more readily available than dense labels. The variational formulation's interpretability is a clear strength over end-to-end networks, and successful scaling to noisy real-world data supports practical utility in clinical workflows.

major comments (3)

[Abstract] Abstract: The central claim of 'superior performance' over existing methods on two public datasets is asserted without any quantitative metrics, tables, error bars, or specific numerical comparisons, rendering the claim unverifiable from the provided text and load-bearing for acceptance.
[Method] Method section (variational stage description): No details are supplied on the learned regularizer's architecture, training objective, dataset (including any separation from test images), or optimization procedure. This is critical because the regularizer must disambiguate spatial layouts consistent with global proportions; without these, the risk of circular fitting or implicit dense supervision cannot be assessed.
[Experiments] Experiments section: Absence of ablation studies isolating the learned regularizer's contribution versus the initial transformer + TTA estimates, and no description of how global proportions were obtained or validated for the public datasets, undermines evaluation of the two-stage framework's necessity and robustness.

minor comments (1)

[Abstract] The GitHub link is given but code is not yet available; this is standard for arXiv submissions but should be confirmed in the final version.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important areas for improving clarity and verifiability. We address each major comment below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim of 'superior performance' over existing methods on two public datasets is asserted without any quantitative metrics, tables, error bars, or specific numerical comparisons, rendering the claim unverifiable from the provided text and load-bearing for acceptance.

Authors: We agree that the abstract should include quantitative support for the performance claims. In the revised manuscript, we will expand the abstract to report key metrics (e.g., mIoU and Dice scores) with direct numerical comparisons to the weakly supervised and unsupervised baselines, including standard deviations from repeated runs where applicable. revision: yes
Referee: [Method] Method section (variational stage description): No details are supplied on the learned regularizer's architecture, training objective, dataset (including any separation from test images), or optimization procedure. This is critical because the regularizer must disambiguate spatial layouts consistent with global proportions; without these, the risk of circular fitting or implicit dense supervision cannot be assessed.

Authors: We will substantially expand the Method section to provide the missing details. The learned regularizer is a lightweight convolutional network whose architecture, training objective (a self-supervised loss encouraging spatial consistency while matching global proportions on synthetic layouts), training dataset (generated synthetically from proportion statistics on a held-out collection of histopathology images with no overlap to any test images), and optimization procedure (gradient-based minimization with explicit hyperparameters) will be described explicitly. This training uses only global proportions and synthetic data, with no access to pixel-level labels on the evaluation images, thereby avoiding circular fitting or implicit dense supervision. revision: yes
Referee: [Experiments] Experiments section: Absence of ablation studies isolating the learned regularizer's contribution versus the initial transformer + TTA estimates, and no description of how global proportions were obtained or validated for the public datasets, undermines evaluation of the two-stage framework's necessity and robustness.

Authors: We will add ablation experiments that isolate the contribution of the variational stage (learned regularizer + Wasserstein fidelity) by comparing full VSLP against the initial transformer + TTA confidence maps alone. We will also clarify the provenance of global proportions: for the dataset with pathologist estimates, these are used directly as provided; for the second public dataset, proportions are obtained by summing the available (but unused) pixel annotations into global counts only. We will include a short validation subsection describing how these proportions were cross-checked for consistency. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper presents VSLP as a two-stage method: a pre-trained transformer with test-time augmentation generates initial pixel-wise confidence maps, followed by variational optimization using a Wasserstein fidelity term and a learned regularizer to produce dense segmentations from global proportions alone. No equations, self-citations, or definitional steps in the abstract or description reduce the output by construction to the inputs. The regularizer is introduced as learned without any indication that its training objective or data directly encodes the target segmentations or global proportions in a self-referential manner. The framework is self-contained against external benchmarks with no load-bearing self-citation chains or fitted predictions renamed as independent results.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the utility of a pre-trained transformer for initial estimates and the effectiveness of a learned regularizer whose training procedure is unspecified in the abstract.

free parameters (1)

parameters of the learned regularizer
Regularizer is explicitly learned, implying parameters fitted to data during the second stage.

axioms (2)

standard math Wasserstein distance serves as an appropriate data fidelity term for proportion matching
Invoked directly in the variational optimization problem described in the abstract.
domain assumption Pre-trained transformer with test-time augmentation yields useful pixel-wise confidence estimates
First stage of the framework relies on this without further justification in the abstract.

pith-pipeline@v0.9.0 · 5567 in / 1400 out tokens · 82574 ms · 2026-05-07T17:13:44.794012+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

70 extracted references · 42 canonical work pages · 2 internal anchors

[1]

A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction.ArXiv, 2024

Meixu Chen, Kai Wang, Payal Kapur, James Brugaro- las, Raquibul Hannan, and Jing Wang. A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction.ArXiv, 2024. doi: 10. 48550/arXiv.2412.07136. URLhttps://arxiv.org/ abs/2412.07136

work page arXiv 2024
[2]

Histopathology based ai model predicts anti-angiogenic therapy response in renal cancer clinical trial.Na- ture communications, 16(1):2610, 2025

Jay Jasti, Hua Zhong, Vandana Panwar, Vipul Jarmale, Jeffrey Miyata, Deyssy Carrillo, Alana Christie, Dinesh Rakheja, Zora Modrusan, Edward Ernest Kadel III, et al. Histopathology based ai model predicts anti-angiogenic therapy response in renal cancer clinical trial.Na- ture communications, 16(1):2610, 2025. doi: 10.1038/ s41467-025-57717-6

2025
[3]

Spatial transcriptomics inferred from pathology whole-slide images links tu- mor heterogeneity to survival in breast and lung can- cer.Scientific reports, 10(1):18802, 2020

Alona Levy-Jurgenson, Xavier Tekpli, Vessela N Kris- tensen, and Zohar Yakhini. Spatial transcriptomics inferred from pathology whole-slide images links tu- mor heterogeneity to survival in breast and lung can- cer.Scientific reports, 10(1):18802, 2020. doi: 10.1038/ s41598-020-75708-z

2020
[4]

Co- training for demographic classification using deep learn- ing from label proportions

Ehsan Mohammady Ardehaly and Aron Culotta. Co- training for demographic classification using deep learn- ing from label proportions. InICDMW, pages 1017–1024,
[5]

doi: 10.1109/icdmw.2017.144

work page doi:10.1109/icdmw.2017.144 2017
[6]

Proportion estimation by masked learning from label proportion

Takumi Okuo, Kazuya Nishimura, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, and Ryoma Bise. Proportion estimation by masked learning from label proportion. In MICCAI, pages 117–126. Springer, 2023. doi: 10.1007/ 978-3-031-58171-7_12

2023
[7]

Selvaraju, Michael Cogswell, Ab- hishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra

Lyndon Chan, Mahdi S Hosseini, Corwyn Rowsell, Kon- stantinos N Plataniotis, and Savvas Damaskinos. His- tosegnet: Semantic segmentation of histological tissue type in whole slide images. InCVPR, pages 10662– 10671, 2019. doi: 10.1109/ICCV .2019.01076

work page doi:10.1109/iccv 2019
[8]

Online easy example mining for weakly- supervised gland segmentation from histology images

Yi Li, Yiduo Yu, Yiwen Zou, Tianqi Xiang, and Xi- aomeng Li. Online easy example mining for weakly- supervised gland segmentation from histology images. In MICCAI, pages 578–587. Springer, 2022. doi: 10.1007/ 978-3-031-16440-8_55

2022
[9]

Weakly-supervised semantic segmentation for histopathology images based on dataset synthesis and feature consistency constraint

Zijie Fang, Yang Chen, Yifeng Wang, Zhi Wang, Xi- angyang Ji, and Yongbing Zhang. Weakly-supervised semantic segmentation for histopathology images based on dataset synthesis and feature consistency constraint. InAAAI conference on artificial intelligence, volume 37, pages 606–613, 2023. doi: 10.1609/aaai.v37i1.25136

work page doi:10.1609/aaai.v37i1.25136 2023
[10]

Scribble-supervised meibomian glands segmentation in infrared images.TOMM, 18(3):1–23, 2022

Xiaoming Liu, Shuo Wang, Ying Zhang, and Quan Yuan. Scribble-supervised meibomian glands segmentation in infrared images.TOMM, 18(3):1–23, 2022. doi: 10.1145/ 3497747

2022
[11]

Learning pixel-level se- mantic affinity with image-level supervision for weakly supervised semantic segmentation

Jiwoon Ahn and Suha Kwak. Learning pixel-level se- mantic affinity with image-level supervision for weakly supervised semantic segmentation. InCVPR, pages 4981– 4990, 2018

2018
[12]

Yi Li, Yiqun Duan, Zhanghui Kuang, Yimin Chen, Wayne Zhang, and Xiaomeng Li. Uncertainty estimation via response scaling for pseudo-mask noise mitigation in Preprint– SemanticSegmentation forHistopathology usingLearnedRegularization based onGlobalProportions11 weakly-supervised semantic segmentation. InAAAI, vol- ume 36, pages 1447–1455, 2022

2022
[13]

Bailey, Walter F

Xueliang Zhang, Qi Su, Pengfeng Xiao, Wenye Wang, Zhenshi Li, and Guangjun He. Flipcam: A feature- level flipping augmentation method for weakly supervised building extraction from high-resolution remote sensing imagery.TGRS, 62:1–17, 2024. doi: 10.1109/TGRS. 2024.3360276

work page doi:10.1109/tgrs 2024
[14]

Joint fully convolutional and graph convolutional networks for weakly-supervised segmentation of pathology images

Jun Zhang, Zhiyuan Hua, Kezhou Yan, Kuan Tian, Jian- hua Yao, Eryun Liu, Mingxia Liu, and Xiao Han. Joint fully convolutional and graph convolutional networks for weakly-supervised segmentation of pathology images. Medical image analysis, 73:102183, 2021. doi: 10.1016/ j.media.2021.102183

work page arXiv 2021
[15]

Global-and-local context network for semantic segmentation of street view images

Chih-Yang Lin, Yi-Cheng Chiu, Hui-Fuang Ng, Timo- thy K Shih, and Kuan-Hung Lin. Global-and-local context network for semantic segmentation of street view images. Sensors, 20(10):2907, 2020

2020
[16]

ImageNet: A large-scale hierarchical im age database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical im- age database. InCVPR, pages 248–255, 2009. doi: 10.1109/CVPR.2009.5206848

work page doi:10.1109/cvpr.2009.5206848 2009
[17]

https://arxiv.org/abs/2103.14030

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InCVPR, pages 10012–10022, 2021. URL https://arxiv.org/abs/2103.14030

work page arXiv 2021
[18]

1992 , issn =

Leonid I Rudin, Stanley Osher, and Emad Fatemi. Nonlin- ear total variation based noise removal algorithms.Phys- ica D: nonlinear phenomena, 60(1-4):259–268, 1992. doi: 10.1016/0167-2789(92)90242-F

work page doi:10.1016/0167-2789(92)90242-f 1992
[19]

A unified variational seg- mentation framework with a level-set based sparse com- posite shape prior.Physics in Medicine&Biology, 60(5): 1865, 2015

Wenyang Liu and Dan Ruan. A unified variational seg- mentation framework with a level-set based sparse com- posite shape prior.Physics in Medicine&Biology, 60(5): 1865, 2015

2015
[20]

Chambolle and T

Antonin Chambolle and Thomas Pock. An introduction to continuous optimization for imaging.Acta Numerica, 25:161–319, 2016. doi: 10.1017/S096249291600009X

work page doi:10.1017/s096249291600009x 2016
[21]

Total deep variation: A stable regulariza- tion method for inverse problems.IEEE PAMI, 44(12): 9163–9180, 2021

Erich Kobler, Alexander Effland, Karl Kunisch, and Thomas Pock. Total deep variation: A stable regulariza- tion method for inverse problems.IEEE PAMI, 44(12): 9163–9180, 2021. doi: 10.1109/TPAMI.2021.3124086

work page doi:10.1109/tpami.2021.3124086 2021
[22]

Learning weakly convex regularizers for convergent image-reconstruction algorithms.SIAM Journal on Imaging Sciences, 17(1):91–115, 2024

Alexis Goujon, Sebastian Neumayer, and Michael Unser. Learning weakly convex regularizers for convergent image-reconstruction algorithms.SIAM Journal on Imaging Sciences, 17(1):91–115, 2024. doi: 10.1137/ 23M1565243. URLhttps://arxiv.org/abs/2308. 10542

2024
[23]

Total deep variation for linear inverse problems

Erich Kobler, Alexander Effland, Karl Kunisch, and Thomas Pock. Total deep variation for linear inverse problems. InCVPR, pages 7549–7558, 2020. doi: 10.48550/arXiv.2001.05005

work page doi:10.48550/arxiv.2001.05005 2020
[24]

Blind single image super-resolution via it- erated shared prior learning

Thomas Pinetz, Erich Kobler, Thomas Pock, and Alexan- der Effland. Blind single image super-resolution via it- erated shared prior learning. InGCPR, pages 151–165. Springer, 2022. doi: 10.1007/978-3-031-16788-1_10

work page doi:10.1007/978-3-031-16788-1_10 2022
[25]

Equivariant boot- strapping for uncertainty quantification in imaging inverse problems.ArXiv, 2023

Julian Tachella and Marcelo Pereyra. Equivariant boot- strapping for uncertainty quantification in imaging inverse problems.ArXiv, 2023. doi: 10.48550/arXiv.2310.11838. URLhttps://arxiv.org/abs/2310.11838

work page doi:10.48550/arxiv.2310.11838 2023
[26]

Dalalyan

Arnak S. Dalalyan. Theoretical guarantees for approx- imate sampling from smooth and log-concave densities. Journal of the Royal Statistical Society Series B: Statisti- cal Methodology, 79(3):651–676, 04 2016. ISSN 1369-

2016
[27]

doi: 10.1111/rssb.12183

work page doi:10.1111/rssb.12183
[28]

Multi- task deep learning for medical image computing and analy- sis: A review

Xiaoming Liu, Qi Liu, Ying Zhang, Man Wang, and Jin- shan Tang. Tssk-net: Weakly supervised biomarker local- ization and segmentation with image-level annotation in retinal oct images.Computers in Biology and Medicine, 153:106467, 2023. doi: 10.1016/j.compbiomed.2022. 106467

work page doi:10.1016/j.compbiomed.2022 2023
[29]

Weakly supervised deep nuclei segmentation with sparsely an- notated bounding boxes for dna image cytometry

Yixiong Liang, Zhihua Yin, Haotian Liu, Hailong Zeng, Jianxin Wang, Jianfeng Liu, and Nanying Che. Weakly supervised deep nuclei segmentation with sparsely an- notated bounding boxes for dna image cytometry. IEEE/ACM transactions on computational biology and bioinformatics, 20(1):785–795, 2021. doi: 10.1109/ TCBB.2021.3138189

work page arXiv 2021
[30]

Weakly supervised segmenta- tion with point annotations for histopathology images via contrast-based variational model

Hongrun Zhang, Liam Burrows, Yanda Meng, De- clan Sculthorpe, Abhik Mukherjee, Sarah E Coupland, Ke Chen, and Yalin Zheng. Weakly supervised segmenta- tion with point annotations for histopathology images via contrast-based variational model. InICCV, pages 15630– 15640, 2023. URLhttps://arxiv.org/abs/2304. 03572

2023
[31]

A comprehensive analysis of weakly- supervised semantic segmentation in different image do- mains.IJCV, 129(2):361–384, 2021

Lyndon Chan, Mahdi S Hosseini, and Konstantinos N Plataniotis. A comprehensive analysis of weakly- supervised semantic segmentation in different image do- mains.IJCV, 129(2):361–384, 2021. doi: 10.1007/ s11263-020-01373-4

2021
[32]

Cluster-to-conquer: A framework for end-to-end multi- instance learning for whole slide image classification

Yash Sharma, Aman Shrivastava, Lubaina Ehsan, Christo- pher A Moskaluk, Sana Syed, and Donald Brown. Cluster-to-conquer: A framework for end-to-end multi- instance learning for whole slide image classification. InMIDL, pages 682–698. PMLR, 2021. URLhttps: //arxiv.org/abs/2103.10626

work page arXiv 2021
[33]

Detection of breast cancer from whole slide histopathological images using deep multiple in- stance cnn.IEEE Access, 8:213502–213511, 2020

Kausik Das, Sailesh Conjeti, Jyotirmoy Chatterjee, and Debdoot Sheet. Detection of breast cancer from whole slide histopathological images using deep multiple in- stance cnn.IEEE Access, 8:213502–213511, 2020. doi: 10.1109/ACCESS.2020.3040106

work page doi:10.1109/access.2020.3040106 2020
[34]

Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch- level classification labels.Medical Image Analysis, 80: 102487, 2022

Chu Han, Jiatai Lin, Jinhai Mai, Yi Wang, Qingling Zhang, Bingchao Zhao, Xin Chen, Xipeng Pan, Zhenwei Shi, Zeyan Xu, et al. Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch- level classification labels.Medical Image Analysis, 80: 102487, 2022. doi: 10.1016/j.media.2022.102487

work page doi:10.1016/j.media.2022.102487 2022
[35]

Weakly supervised histopathology image segmen- tation with self-attention.Medical Image Analysis, 86: 102791, 2023

Kailu Li, Ziniu Qian, Yingnan Han, I Eric, Chao Chang, Bingzheng Wei, Maode Lai, Jing Liao, Yubo Fan, and Yan Xu. Weakly supervised histopathology image segmen- tation with self-attention.Medical Image Analysis, 86: 102791, 2023. doi: https://doi.org/10.1016/j.media.2023. 102791. Preprint– SemanticSegmentation forHistopathology usingLearnedRegularization ...

work page doi:10.1016/j.media.2023 2023
[36]

Svm for learning with label propor- tions

Felix Yu, Dong Liu, Sanjiv Kumar, Jebara Tony, and Shih-Fu Chang. Svm for learning with label propor- tions. InICML, pages 504–512. PMLR, 2013. URL https://arxiv.org/abs/1306.0886

work page arXiv 2013
[37]

A two- stage training framework with feature-label matching mechanism for learning from label proportions

Haoran Yang, Wanjing Zhang, and Wai Lam. A two- stage training framework with feature-label matching mechanism for learning from label proportions. In Asian Conference on Machine Learning, pages 1461–
[38]

URLhttps://proceedings.mlr

PMLR, 2021. URLhttps://proceedings.mlr. press/v157/yang21b.html

2021
[39]

Large-Scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword- to-Caption Augmentation

Shinnosuke Matsuo, Ryoma Bise, Seiichi Uchida, and Daiki Suehiro. Learning from label proportion with online pseudo-label decision by regret minimization. InICASSP, pages 1–5. IEEE, 2023. doi: 10.1109/ICASSP49357. 2023.10097069

work page doi:10.1109/icassp49357 2023
[40]

Manning, Stefano Ermon, and Chelsea Finn

Liyao Tang, Zhe Chen, Shanshan Zhao, Chaoyue Wang, and Dacheng Tao. All points matter: entropy-regularized distribution alignment for weakly-supervised 3d segmen- tation.NeuRIPS, 36, 2024. doi: 10.5555/3666122. 3669562

work page doi:10.5555/3666122 2024
[41]

Negative pseudo la- beling using class proportion for semantic segmentation in pathology

Hiroki Tokunaga, Brian Kenji Iwana, Yuki Teramoto, Ak- ihiko Yoshizawa, and Ryoma Bise. Negative pseudo la- beling using class proportion for semantic segmentation in pathology. InECCV, pages 430–446. Springer, 2020. doi: 10.1007/978-3-030-58555-6_26

work page doi:10.1007/978-3-030-58555-6_26 2020
[42]

Learning from partial label proportions for whole slide image segmentation

Shinnosuke Matsuo, Daiki Suehiro, Seiichi Uchida, Hi- roaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, and Ry- oma Bise. Learning from partial label proportions for whole slide image segmentation. InMICCAI, pages 372–
[43]

doi: 10.48550/arXiv.2405.09041

Springer, 2024. doi: 10.48550/arXiv.2405.09041

work page doi:10.48550/arxiv.2405.09041 2024
[44]

A first-order primal-dual algorithm for convex problems with appli- cations to imaging.J

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with appli- cations to imaging.J. Math. Imaging Vis., 40:120–145,
[45]

doi: 10.1007/s10851-010-0251-1

work page doi:10.1007/s10851-010-0251-1
[46]

Shared prior learning of energy-based mod- els for image reconstruction.SIAM J

Thomas Pinetz, Erich Kobler, Thomas Pock, and Alexan- der Effland. Shared prior learning of energy-based mod- els for image reconstruction.SIAM J. Imaging Sci., 14(4): 1706–1748, 2021. doi: 10.48550/arXiv.2011.06539

work page doi:10.48550/arxiv.2011.06539 2021
[47]

Cornia, M

Kai Zhang, Luc Van Gool, and Radu Timofte. Deep un- folding network for image super-resolution. InCVPR, pages 3217–3226, 2020. doi: 10.1109/CVPR42600.2020. 00328

work page doi:10.1109/cvpr42600.2020 2020
[48]

Fista-net: Learning a fast iterative shrinkage thresholding network for inverse problems in imaging.TMI, 40(5):1329–1339,

Jinxi Xiang, Yonggui Dong, and Yunjie Yang. Fista-net: Learning a fast iterative shrinkage thresholding network for inverse problems in imaging.TMI, 40(5):1329–1339,
[49]

doi: 10.1109/TMI.2021.3054167

work page doi:10.1109/tmi.2021.3054167 2021
[50]

Total generalized variation,

Kristian Bredies, Karl Kunisch, and Thomas Pock. Total generalized variation.SIAM Journal on Imaging Sciences, 3(3):492–526, 2010. doi: 10.1137/090769521

work page doi:10.1137/090769521 2010
[51]

Springer International Pub- lishing, Cham, 2023

Sebastian Lunz.Learned Regularizers for Inverse Prob- lems, pages 1133–1153. Springer International Pub- lishing, Cham, 2023. ISBN 978-3-030-98661-2. doi: 10.1007/978-3-030-98661-2_68

work page doi:10.1007/978-3-030-98661-2_68 2023
[52]

Varia- tional methods with application to medical image segmen- tation: A survey.Neurocomputing, page 130260, 2025

Xiu Shu, Zhihui Li, Xiaojun Chang, and Di Yuan. Varia- tional methods with application to medical image segmen- tation: A survey.Neurocomputing, page 130260, 2025

2025
[53]

Learning loss for test-time augmentation.NeuRIPS, 33: 4163–4174, 2020

Ildoo Kim, Younghoon Kim, and Sungwoong Kim. Learning loss for test-time augmentation.NeuRIPS, 33: 4163–4174, 2020. doi: 10.48550/arXiv.2010.11422. URL https://arxiv.org/abs/2010.11422

work page doi:10.48550/arxiv.2010.11422 2020
[54]

Better aggregation in test-time aug- mentation

Divya Shanmugam, Davis Blalock, Guha Balakrishnan, and John Guttag. Better aggregation in test-time aug- mentation. InCVPR, pages 1214–1223, 2021. URL https://arxiv.org/abs/2011.11156

work page arXiv 2021
[55]

Springer, Basel, 2005

Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré.Gra- dient flows: in metric spaces and in the space of proba- bility measures. Springer, Basel, 2005. doi: 10.1007/ 978-3-7643-8722-8

2005
[56]

Cédric Villani.Optimal transport: old and new, volume
[57]

Optimal transport , volume 338 of Grundlehren der math- ematischen Wissenschaften [Fundamental Principles of Mathematical Sci- ences]

Springer, Berlin, 2009. URLhttps://doi.org/ 10.1007/978-3-540-71050-9

work page doi:10.1007/978-3-540-71050-9 2009
[58]

f-gan: Training generative neural samplers using varia- tional divergence minimization.NeuRIPS, 29, 2016

Sebastian Nowozin, Botond Cseke, and Ryota Tomioka. f-gan: Training generative neural samplers using varia- tional divergence minimization.NeuRIPS, 29, 2016. doi: 10.48550/arXiv.1606.00709. URLhttps://arxiv. org/abs/1606.00709

work page doi:10.48550/arxiv.1606.00709 2016
[59]

Wasserstein GAN

Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. InICML, pages 214–223. PMLR, 2017. URLhttps://arxiv. org/abs/1701.07875

work page Pith review arXiv 2017
[60]

U-Net: Convolutional Networks for Biomedical Image Segmentation, pp.\ 234–241

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U- net: Convolutional networks for biomedical image seg- mentation. InMICCAI, pages 234–241. Springer, 2015. doi: 10.1007/978-3-319-24574-4_28

work page doi:10.1007/978-3-319-24574-4_28 2015
[61]

Bayesian learning via stochastic gradient langevin dynamics

Max Welling and Yee W Teh. Bayesian learning via stochastic gradient langevin dynamics. InICML, pages 681–688, Madison, WI, USA, 2011. Omnipress

2011
[62]

Improved denois- ing diffusion probabilistic models.arXiv preprint arXiv:2102.09672,

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. InICML, pages 8162–8171. PMLR, 2021. URLhttps://arxiv.org/ abs/2102.09672

work page arXiv 2021
[63]

A hybrid deep learning approach for gland segmentation in prostate histopathological images

Massimo Salvi, Martino Bosco, Luca Molinaro, Alessan- dro Gambella, Mauro Papotti, U Rajendra Acharya, and Filippo Molinari. A hybrid deep learning approach for gland segmentation in prostate histopathological images. Artificial Intelligence in Medicine, 115:102076, 2021. doi: 10.1016/j.artmed.2021.102076

work page doi:10.1016/j.artmed.2021.102076 2021
[64]

Structured crowdsourcing enables convolutional segmentation of histology images

Mohamed Amgad, Habiba Elfandy, Hagar Hussein, Lamees A Atteya, Mai AT Elsebaie, Lamia S Abo El- nasr, Rokia A Sakr, Hazem SE Salem, Ahmed F Is- mail, Anas M Saad, et al. Structured crowdsourcing enables convolutional segmentation of histology images. Bioinformatics, 35(18):3461–3467, 2019. doi: 10.1093/ bioinformatics/btz083

2019
[65]

Language Model Cascades: Token-Level Uncertainty and Beyond

Ilya Loshchilov and Frank Hutter. Decoupled weight de- cay regularization. InICLR, 2019. doi: 10.48550/arXiv. 1711.05101

work page internal anchor Pith review doi:10.48550/arxiv 2019
[66]

Adam: A Method for Stochastic Optimization

Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. InICLR, 2015. doi: 10.48550/ arXiv.1412.6980

work page internal anchor Pith review arXiv 2015
[67]

Segment anything in medical images.Na- Preprint– SemanticSegmentation forHistopathology usingLearnedRegularization based onGlobalProportions13 ture Communications, 15(1):654, 2024

Jun Ma, Yuting He, Feifei Li, Lin Han, Chenyu You, and Bo Wang. Segment anything in medical images.Na- Preprint– SemanticSegmentation forHistopathology usingLearnedRegularization based onGlobalProportions13 ture Communications, 15(1):654, 2024. doi: 10.1038/ s41467-024-44824-z

2024
[68]

Swin-unet: Unet-like pure transformer for medical image segmenta- tion

Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xi- aopeng Zhang, Qi Tian, and Manning Wang. Swin-unet: Unet-like pure transformer for medical image segmenta- tion. InECCV, pages 205–218. Springer, 2022

2022
[69]

Transunet: Transformers make strong encoders for medical image segmentation.arXiv, 2021

Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L Yuille, and Yuyin Zhou. Transunet: Transformers make strong encoders for medical image segmentation.arXiv, 2021

2021
[70]

Generative adversarial networks for stain normalisation in histopathology

Jack Breen, Kieran Zucker, Katie Allen, Nishant Raviku- mar, and Nicolas M Orsi. Generative adversarial networks for stain normalisation in histopathology. InApplications of Generative AI, pages 227–247. Springer, Cham, 2024

2024