
arxiv: 2606.18876 · v2 · pith:SGGOXLDSnew · submitted 2026-06-17 · 💻 cs.CV · cs.LG

Test-Time Adaptation in Optical Coherence Tomography Using Trajectory-Aligned Time-Independent Flow

Pith reviewed 2026-06-26 21:28 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords test-time adaptation, optical coherence tomography, flow matching, age-related macular degeneration, domain adaptation, image denoising, biomarker segmentation

The pith

A flow-matching method aligns noisy OCT test images to training distributions by histogram-matching to synthetic trajectories and removing time conditioning, enabling state-of-the-art AMD biomarker segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a test-time adaptation approach for optical coherence tomography that generates high-quality surrogate images from noisy inputs using flow matching. Domain gaps are handled by matching each test image histogram to synthetic reference trajectories and by stripping time conditioning from the network to tolerate real-world noise variations. The resulting aligned inputs allow a fixed segmentation model to reach state-of-the-art performance on critical biomarkers for two stages of age-related macular degeneration.

Core claim

By matching the histogram of a test OCT image to synthetic reference trajectories inside a flow-matching process and by removing the network's time conditioning, the input distribution is brought into alignment with the training distribution, allowing a pre-trained model to produce accurate segmentations of AMD biomarkers without any retraining or fine-tuning.
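The histogram-matching step in this claim can be sketched as follows. This is a minimal illustration under stated assumptions: intensities lie in [0, 1], the reference is the mean histogram of a stack of images taken at one point along the synthetic trajectories, and matching is done by aligning cumulative distributions. The function names `average_histogram` and `match_to_histogram`, the bin count, and the beta-distributed toy data are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def average_histogram(images, bins=256, value_range=(0.0, 1.0)):
    """Mean normalized histogram over a stack of reference images."""
    hists = [np.histogram(img, bins=bins, range=value_range)[0] for img in images]
    mean_hist = np.mean(hists, axis=0)
    return mean_hist / mean_hist.sum()

def match_to_histogram(image, ref_hist, value_range=(0.0, 1.0)):
    """Remap intensities so the image's CDF follows the reference CDF."""
    bins = ref_hist.size
    lo, hi = value_range
    src_hist, edges = np.histogram(image, bins=bins, range=value_range)
    src_cdf = np.cumsum(src_hist) / image.size
    ref_cdf = np.cumsum(ref_hist) / ref_hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    # For each source bin, look up the reference intensity with the same CDF value.
    mapped = np.interp(src_cdf, ref_cdf, centers)
    # Bin index of every pixel (assumes values inside value_range).
    idx = np.clip(((image - lo) / (hi - lo) * bins).astype(int), 0, bins - 1)
    return mapped[idx]

# Toy data (assumption): references centred near 0.5, test image darker.
rng = np.random.default_rng(0)
references = rng.beta(2, 2, size=(8, 64, 64))
test_image = rng.beta(2, 5, size=(64, 64))

ref_hist = average_histogram(references)
aligned = match_to_histogram(test_image, ref_hist)
print(test_image.mean(), aligned.mean())
```

In the setting the paper describes, one such reference histogram would presumably be kept per time step of the trajectory; the sketch shows a single step only.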

What carries the argument

Trajectory-aligned time-independent flow, which performs histogram matching of test images to synthetic reference trajectories while dropping explicit time conditioning in the denoising network.
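To make "dropping explicit time conditioning" concrete, here is a hedged sketch of integrating a flow whose velocity function receives only the current image and no time input. The Euler solver, the starting time `s_start`, and the hand-written `toy_velocity` (which stands in for a trained network and simply pulls the state toward a known clean image) are all assumptions made for illustration; none of them is the paper's implementation.

```python
import numpy as np

def integrate_flow(x_start, velocity, s_start, n_steps):
    """Euler-integrate dx/ds = velocity(x) from s_start to 1.

    `velocity` takes only the current state. There is no time argument,
    which is what "time-independent" is taken to mean in this sketch.
    """
    x = np.array(x_start, dtype=float)
    ds = (1.0 - s_start) / n_steps
    for _ in range(n_steps):
        x = x + ds * velocity(x)
    return x

# Toy setup (assumption): a known clean image and a noisy observation of it.
clean = np.linspace(0.0, 1.0, 16).reshape(4, 4)
rng = np.random.default_rng(0)
noisy = clean + 0.3 * rng.normal(size=clean.shape)

def toy_velocity(x):
    # Stand-in for a trained, unconditional network: points toward `clean`.
    return 4.0 * (clean - x)

restored = integrate_flow(noisy, toy_velocity, s_start=0.5, n_steps=50)
print(np.linalg.norm(noisy - clean), np.linalg.norm(restored - clean))
```

Because the velocity function has no time input, a mismatch between the assumed and actual noise level of the test image changes only the starting point of the integration, not what the function is asked to compute at each step.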

If this is right

  • A single pre-trained segmentation network can be deployed across OCT devices of varying quality without retraining.
  • Biomarker segmentation accuracy improves on both early and advanced AMD cases when test images are passed through the adapted flow.
  • The same histogram-matching and time-independent mechanism can be inserted into other flow-based or diffusion-based medical image pipelines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may extend to other noisy imaging modalities such as ultrasound or low-dose CT where synthetic trajectory references can be generated.
  • Real-time clinical workflows could use the method to standardize incoming scans from different scanners before automated analysis.
  • If synthetic trajectories prove too expensive to generate, simpler statistical matching rules might be derived as a lighter alternative.

Load-bearing premise

That matching a test image histogram to synthetic reference trajectories will bring the input distribution close enough to training data to overcome the domain gap.

What would settle it

Segment AMD biomarkers on a held-out set of low-quality OCT scans with and without the histogram-matching step. If accuracy does not rise when the step is added, the alignment claim is false.
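The comparison described here can be sketched as a per-scan Dice difference. This is an illustrative outline only: the helper names `dice` and `paired_dice_gain` and the 2x2 toy masks are hypothetical, and the choice of Dice as the accuracy measure follows the DSC mentioned in the Figure 3 caption rather than a protocol stated on this page.

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def paired_dice_gain(preds_without, preds_with, gts):
    """Per-scan Dice difference: with histogram matching minus without."""
    d_without = np.array([dice(p, g) for p, g in zip(preds_without, gts)])
    d_with = np.array([dice(p, g) for p, g in zip(preds_with, gts)])
    return d_with - d_without

# Toy masks (assumption): one scan, one lesion of two pixels.
gt = np.array([[1, 1], [0, 0]])
pred_without = np.array([[1, 0], [0, 0]])  # misses one lesion pixel
pred_with = np.array([[1, 1], [0, 0]])     # matches the annotation

gains = paired_dice_gain([pred_without], [pred_with], [gt])
print(gains.mean())
```

On a real held-out set, the array of per-scan gains would be the input to a paired significance test; a mean gain that is not reliably above zero would count against the alignment claim.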

Figures

Figures reproduced from arXiv: 2606.18876 by Gregor Reiter, Hrvoje Bogunović, Thomas Pinetz, Ursula Schmidt-Erfurth, Veit Hucke.

Figure 1: Schematic overview of the TTA-Flow framework. The process begins by (a) generating reference trajectories and (b, c) calculating an average histogram H̄_s for each time step. During inference, (d) an incoming test sample ζ is matched to the corresponding reference histogram before being processed by the flow matching network. Finally, (e) the reconstructed sample z_S is utilized for downstream evaluation.
Figure 2: Comparison to the state-of-the-art on downstream fluid and GA segmentation on the Cirrus (first two rows) and Topcon (remaining rows) data. For the GA segmentation, a white horizontal line denotes the position of the corresponding bscan (In-house, top row). Note that only every second bscan is annotated in the GA GT.
Figure 3: Comparison of downstream lesion segmentation performance (DSC) for different reference time points s_target. For each target, the results of the conditional and unconditional networks are shown. * denotes statistical significance.
Original abstract

Optical coherence tomography (OCT) is essential in ophthalmology, but inconsistent image quality, especially in low-cost devices, hinders automated analysis. To address this, we introduce a flow-matching-based test-time adaptation method that generates high-quality surrogate images from noisy inputs. Typically, domain gaps between test and training data cause pixel distribution mismatches during the denoising process. We overcome this by matching the test image's histogram to synthetic reference trajectories, successfully aligning the input with expected distributions. Additionally, we remove the network's time conditioning to account for slight deviations in real-world noise distributions. Our approach achieves state-of-the-art performance in segmenting critical biomarkers for two stages of Age-related Macular Degeneration (AMD). Code is available at https://github.com/Veit21/tta-flow.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a flow-matching-based test-time adaptation method for OCT images that generates surrogate high-quality images from noisy inputs. It addresses domain gaps by matching the histogram of test images to synthetic reference trajectories and removes time conditioning from the network to handle deviations in noise distributions. The approach is claimed to achieve state-of-the-art performance in segmenting critical biomarkers for two stages of age-related macular degeneration (AMD), with code released.

Significance. If the results hold, the method offers a practical test-time solution for adapting to device-specific variations in OCT without retraining, which could improve automated biomarker analysis in clinical settings with low-cost or inconsistent imaging hardware. The public code release supports reproducibility.

major comments (2)
  1. [Method (histogram matching step)] The core assumption that histogram matching to synthetic trajectories sufficiently aligns input distributions for the subsequent time-independent flow-matching denoising (described in the abstract and method) is load-bearing for the SOTA segmentation claim. However, OCT domain shifts typically involve spatially correlated speckle, motion artifacts, and device-specific point-spread functions beyond marginal intensity statistics; a global histogram transform cannot correct these, risking that the flow model denoises toward an incorrect manifold and propagates errors to biomarker segmentation.
  2. [Abstract and Experiments] The abstract asserts SOTA segmentation results for AMD biomarkers but provides no quantitative numbers, baselines, error bars, dataset details, or ablation studies. Without these, the central performance claim cannot be evaluated, and the manuscript must supply them with statistical rigor to support the adaptation method's effectiveness.
minor comments (1)
  1. [Method] Clarify the exact procedure for generating synthetic reference trajectories and how they are chosen to ensure they represent the expected training distribution.
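The concern in major comment 1 can be made concrete with a toy example. Histogram matching is a monotone, pixel-wise intensity remapping, so it changes which values appear but not which pixel is brighter than which. The sketch below uses an exact rank-based variant of histogram matching (an assumption chosen for clarity, not the paper's procedure) and a crude synthetic "speckle" built by summing neighbouring noise values; the function names are hypothetical.

```python
import numpy as np

def exact_histogram_match(image, reference):
    """Rank-based matching: the k-th darkest pixel gets the k-th darkest
    reference value. Requires equal pixel counts."""
    flat = image.ravel()
    order = np.argsort(flat, kind="stable")
    ref_sorted = np.sort(reference.ravel())
    out = np.empty(flat.shape, dtype=float)
    out[order] = ref_sorted
    return out.reshape(image.shape)

def neighbor_rank_correlation(image):
    """Correlation between the ranks of horizontally adjacent pixels."""
    order = np.argsort(image.ravel(), kind="stable")
    ranks = np.argsort(order, kind="stable").reshape(image.shape)
    a = ranks[:, :-1].ravel().astype(float)
    b = ranks[:, 1:].ravel().astype(float)
    return np.corrcoef(a, b)[0, 1]

rng = np.random.default_rng(0)
noise = rng.normal(size=(64, 64))
# Crude spatially correlated "speckle": each pixel shares noise with its neighbour.
speckled = noise + np.roll(noise, 1, axis=1)
# Spatially uncorrelated reference intensities.
reference = rng.uniform(size=(64, 64))

matched = exact_histogram_match(speckled, reference)
print(neighbor_rank_correlation(speckled), neighbor_rank_correlation(matched))
```

After matching, the image has exactly the reference's intensity distribution, yet the correlation between neighbouring pixel ranks is unchanged. Any spatial structure in the noise is therefore left for the flow model to handle, which is the gap the referee points to.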

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below. Where the manuscript requires clarification or additional detail, we will revise accordingly.

  1. Referee: [Method (histogram matching step)] The core assumption that histogram matching to synthetic trajectories sufficiently aligns input distributions for the subsequent time-independent flow-matching denoising (described in the abstract and method) is load-bearing for the SOTA segmentation claim. However, OCT domain shifts typically involve spatially correlated speckle, motion artifacts, and device-specific point-spread functions beyond marginal intensity statistics; a global histogram transform cannot correct these, risking that the flow model denoises toward an incorrect manifold and propagates errors to biomarker segmentation.

    Authors: We agree that histogram matching operates on marginal intensity statistics and does not explicitly model spatially correlated speckle, motion artifacts, or device-specific PSFs. Our design choice was motivated by the observation that the primary domain gap in the target low-cost OCT devices manifests as intensity distribution shifts; the subsequent time-independent flow-matching step is intended to provide robustness to residual deviations. We will add a limitations paragraph in the revised manuscript explicitly discussing the scope of histogram matching and the conditions under which spatially structured artifacts may remain unaddressed. No new experiments are planned for this revision. revision: partial

  2. Referee: [Abstract and Experiments] The abstract asserts SOTA segmentation results for AMD biomarkers but provides no quantitative numbers, baselines, error bars, dataset details, or ablation studies. Without these, the central performance claim cannot be evaluated, and the manuscript must supply them with statistical rigor to support the adaptation method's effectiveness.

    Authors: We acknowledge that the current abstract lacks the requested quantitative details. The full manuscript contains the numerical results (Dice scores, baselines, dataset descriptions, and ablations with error bars), but these were omitted from the abstract for brevity. We will revise the abstract to include the key performance numbers, dataset information, and a brief mention of the statistical evaluation. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation chain is self-contained and externally evaluated

full rationale

The paper presents a test-time adaptation pipeline (histogram matching of test OCT images to synthetic flow trajectories, followed by time-independent flow-matching denoising) whose central steps are defined independently of the final segmentation metrics. No equations, parameters, or claims reduce by construction to fitted inputs or self-citations; the SOTA biomarker segmentation results are reported on external AMD datasets and do not feed back into the method definition. This is the normal case of a non-circular empirical method paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated or derivable from the provided text.

pith-pipeline@v0.9.1-grok · 5676 in / 921 out tokens · 17886 ms · 2026-06-26T21:28:24.171905+00:00 · methodology

