arxiv: 2604.11636 · v1 · submitted 2026-04-13 · 💻 cs.CV

Recognition: unknown

MorphoFlow: Sparse-Supervised Generative Shape Modeling with Adaptive Latent Relevance

Mokshagna Sai Teja Karanam , Tushar Kataria , Shireen Elhabian

Authors on Pith no claims yet

Pith reviewed 2026-05-10 15:02 UTC · model grok-4.3

classification 💻 cs.CV

keywords statistical shape modelingsparse supervisionneural implicit representationsautoregressive flowsgenerative modelinganatomical variability3D reconstruction

0 comments

The pith

MorphoFlow learns compact probabilistic 3D anatomical shape representations directly from sparse surface annotations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Statistical shape modeling has long depended on dense segmentations that are expensive to create at scale. MorphoFlow instead trains on sparse surface points by pairing neural implicit representations for resolution-independent geometry with an autodecoder that optimizes per-instance latent codes. Autoregressive normalizing flows then learn the distribution over those codes, while sparsity-inducing priors adaptively weight latent dimensions according to their relevance to anatomical variation. The result is a generative model that produces plausible shapes, quantifies uncertainty, and recovers population-level modes without manual dimensionality tuning. Experiments on lumbar vertebrae and femur data confirm accurate high-resolution outputs and structured variation recovery.

Core claim

MorphoFlow integrates neural implicit shape representations with an autodecoder formulation and autoregressive normalizing flows to learn an expressive probabilistic density over the latent shape space directly from sparse surface annotations, with adaptive latent relevance weighting through sparsity-inducing priors that regulates the contribution of individual latent dimensions according to their relevance to underlying anatomical variation.

What carries the argument

The adaptive latent relevance weighting via sparsity-inducing priors, which works inside the combined neural-implicit autodecoder and autoregressive-flow architecture to produce compact, structured latent spaces from sparse supervision.

If this is right

High-resolution 3D reconstructions remain accurate even when only sparse surface points are provided.
Structured modes of anatomical variation emerge that align with population-level trends.
The latent space yields a tractable likelihood for generative shape synthesis and uncertainty estimates.
No manual selection of latent dimensionality is required to maintain expressivity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sparse-supervision strategy could lower annotation costs for other 3D medical structures beyond vertebrae and femurs.
The probabilistic output could feed directly into downstream tasks such as registration or surgical planning.
Similar adaptive relevance mechanisms might improve compactness in other latent-variable shape models.

Load-bearing premise

Sparse surface annotations contain enough information for the combined implicit representation, autodecoder, and autoregressive flow to recover accurate, expressive, and anatomically plausible full 3D shapes and their population distribution without dense supervision.

What would settle it

A direct comparison on the lumbar vertebrae or femur datasets in which shapes reconstructed or sampled from the sparse-trained model deviate substantially from dense ground-truth surfaces or from known population statistics would falsify the claim.

Figures

Figures reproduced from arXiv: 2604.11636 by Mokshagna Sai Teja Karanam, Shireen Elhabian, Tushar Kataria.

**Figure 1.** Figure 1: MorphoFlow Architecture. For only generative shape modelling α −1 = 1, making the distribution a standard normal. For compact shape latent α −1 values are learned during training with ARD regularization loss. is particularly advantageous in medical imaging, where segmentation masks are often sparse, anisotropic, and acquired at heterogeneous resolutions. Each training shape is associated with a latent code… view at source ↗

**Figure 2.** Figure 2: Results. (A) Best, median, and worst surface reconstructions obtained from models trained on thin, thick, and orthogonal slices(latent dimension=64). The right column shows models trained with MorphoFlow, while the left column shows models trained without Flow and ARD regularization. (B) Estimated σ values across different latent dimensionalities when using MorphoFlow showing the impact of ARD regularizat… view at source ↗

read the original abstract

Statistical shape modeling (SSM) is central to population level analysis of anatomical variability, yet most existing approaches rely on densely annotated segmentations and fixed latent representations. These requirements limit scalability and reduce flexibility when modeling complex anatomical variation. We introduce MorphoFlow, a sparse supervised generative shape modeling framework that learns compact probabilistic shape representations directly from sparse surface annotations. MorphoFlow integrates neural implicit shape representations with an autodecoder formulation and autoregressive normalizing flows to learn an expressive probabilistic density over the latent shape space. The neural implicit representation enables resolution-agnostic modeling of 3D anatomy, while the autodecoder formulation supports direct optimization of per-instance latent codes under sparse supervision. The autoregressive flow captures the distribution of latent anatomical variability providing a tractable, likelihood-based generative model of shapes. To promote compact and structured latent representations, we incorporate adaptive latent relevance weighting through sparsity-inducing priors, enabling the model to regulate the contribution of individual latent dimensions according to their relevance to the underlying anatomical variation while preserving generative expressivity. The resulting latent space supports uncertainty quantification and anatomically plausible shape synthesis without manual latent dimensionality tuning. Evaluation on publicly available lumbar vertebrae and femur datasets demonstrates accurate high-resolution reconstruction from sparse inputs and recovery of structured modes of anatomical variation consistent with population level trends.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

MorphoFlow puts together autodecoders, autoregressive flows, and a sparsity prior to enable generative SSM from sparse surface points, but the identifiability of the implicit shapes from those points alone remains the open question.

read the letter

MorphoFlow's core idea is to learn a generative model over anatomical shapes using only sparse surface annotations instead of dense segmentations. It does this by pairing a neural implicit representation with an autodecoder that optimizes per-instance latents under sparse loss, then modeling the resulting latent distribution with autoregressive normalizing flows, and adding an adaptive relevance prior to keep the latent dimensions compact and relevant without manual tuning. The result is meant to support high-resolution reconstruction, shape synthesis, and uncertainty estimates while cutting annotation effort on datasets like lumbar vertebrae and femurs. That combination is new enough in the SSM literature to be worth noticing, and the paper executes the integration cleanly on paper. The autodecoder plus flow setup gives a tractable likelihood model, and the sparsity prior addresses a real pain point around latent dimensionality. If the experiments show stable reconstructions and population modes that align with known anatomy, the practical payoff for medical imaging workflows is clear. The soft spot is the one the stress test highlights. Sparse surface points do not uniquely determine an implicit function; many different level sets can agree on those points. Without explicit off-surface samples, normal supervision, or penalties such as Eikonal constraints, the optimized implicits can be degenerate or inconsistent across shapes. The adaptive prior only regularizes the latents after the decoder has done its work, so it does not solve the base fitting problem. The full paper needs to show exactly what constraints were added and provide quantitative metrics against dense baselines with error bars. This work is aimed at researchers who build statistical shape models for medical applications and want to reduce labeling costs. A reader already familiar with implicit representations and flows will see how the pieces are assembled and can judge whether the sparse-supervision claim holds. It deserves a serious referee because the problem is practical, the architecture is coherent, and the evaluation on public data can be checked directly.

Referee Report

3 major / 0 minor

Summary. The paper introduces MorphoFlow, a generative framework for statistical shape modeling of 3D anatomy that learns compact probabilistic representations directly from sparse surface annotations. It combines neural implicit shape representations (via an autodecoder for per-instance latent codes) with autoregressive normalizing flows over the latent space and adaptive latent relevance weighting via sparsity-inducing priors. The central claims are that this enables resolution-agnostic high-resolution reconstruction from sparse inputs and recovery of structured, anatomically plausible modes of population-level variation, as demonstrated on public lumbar vertebrae and femur datasets.

Significance. If the performance claims are substantiated with quantitative evidence, the work would address a key scalability bottleneck in statistical shape modeling by removing the requirement for dense segmentations while providing a likelihood-based generative model with uncertainty quantification and automatic latent dimensionality control.

major comments (3)

[Abstract] Abstract: the evaluation is described only in qualitative terms ('accurate high-resolution reconstruction' and 'recovery of structured modes ... consistent with population level trends') with no reported metrics, baselines, error bars, or cross-validation details, leaving the central empirical claims without verifiable support.
[Method] The method relies on optimizing neural implicit representations solely under a sparse surface loss; because infinitely many implicit functions agree on a sparse set of surface points, the autodecoder step risks non-unique or degenerate solutions unless off-surface supervision, normals, or explicit regularizers (e.g., Eikonal) are employed. The adaptive latent relevance prior operates only on the latent codes and does not resolve this base identifiability issue for the implicit decoder.
[Experiments] The claim that the autoregressive flow recovers 'structured modes of anatomical variation' requires explicit demonstration that the flow adds expressive power beyond the autodecoder alone, including quantitative comparison to standard SSM baselines and verification that the recovered modes align with known anatomical trends rather than artifacts of the sparse supervision.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. We provide point-by-point responses to the major comments and describe the changes we will implement in the revised manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: the evaluation is described only in qualitative terms ('accurate high-resolution reconstruction' and 'recovery of structured modes ... consistent with population level trends') with no reported metrics, baselines, error bars, or cross-validation details, leaving the central empirical claims without verifiable support.

Authors: We acknowledge that the abstract relies on qualitative descriptions. The detailed quantitative results, including metrics, baselines, error bars, and cross-validation procedures, are presented in the Experiments section of the manuscript. We will revise the abstract to incorporate key quantitative highlights from the evaluation to provide verifiable support for the claims. revision: yes
Referee: [Method] The method relies on optimizing neural implicit representations solely under a sparse surface loss; because infinitely many implicit functions agree on a sparse set of surface points, the autodecoder step risks non-unique or degenerate solutions unless off-surface supervision, normals, or explicit regularizers (e.g., Eikonal) are employed. The adaptive latent relevance prior operates only on the latent codes and does not resolve this base identifiability issue for the implicit decoder.

Authors: The referee raises an important point about the potential for non-unique solutions when supervising implicit functions with only sparse surface points. Our current approach does not include additional regularizers such as the Eikonal constraint. We will address this by incorporating an Eikonal loss and sampling off-surface points during training of the autodecoder. This addition will be described in the revised Methods section, and we will discuss its impact on solution uniqueness. revision: yes
Referee: [Experiments] The claim that the autoregressive flow recovers 'structured modes of anatomical variation' requires explicit demonstration that the flow adds expressive power beyond the autodecoder alone, including quantitative comparison to standard SSM baselines and verification that the recovered modes align with known anatomical trends rather than artifacts of the sparse supervision.

Authors: We agree that stronger quantitative evidence is required to demonstrate the added value of the autoregressive flow. While the manuscript provides qualitative results and some baseline comparisons, we will expand the Experiments section with ablations that directly compare the full MorphoFlow model to the autodecoder without the flow, using quantitative metrics such as reconstruction accuracy and likelihood scores. We will also add comparisons to standard statistical shape modeling baselines and provide analysis linking the learned modes to established anatomical trends in the literature. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation composes standard components

full rationale

The paper defines MorphoFlow by combining neural implicit representations, an autodecoder for per-instance latent optimization under sparse supervision, autoregressive normalizing flows for the latent density, and an adaptive relevance prior. None of these steps reduce a claimed output (reconstruction accuracy or recovered population modes) to an input quantity by construction, nor do they rely on load-bearing self-citations whose validity is internal to the present work. The abstract and described framework cite established techniques without tautological redefinition or fitted-input-as-prediction patterns. Evaluation claims rest on external public datasets rather than internal equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no explicit free parameters, axioms, or invented entities are detailed. The approach implicitly relies on standard assumptions that neural implicit functions can represent complex anatomy and that autoregressive flows can model latent distributions; no new entities are postulated.

pith-pipeline@v0.9.0 · 5536 in / 1207 out tokens · 92476 ms · 2026-05-10T15:02:35.169170+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

32 extracted references · 4 canonical work pages · 1 internal anchor

[1]

arXiv preprint arXiv:2405.09707 (2024)

Adams, J., Elhabian, S.: Point2ssm++: Self-supervised learning of anatomical shape models from point clouds. arXiv preprint arXiv:2405.09707 (2024)

work page arXiv 2024
[2]

Elhabian, S.: Weakly supervised bayesian shape modeling fromunsegmentedmedicalimages.In:InternationalWorkshoponShapeinMedical Imaging

Adams, J., Iyer, K., Y. Elhabian, S.: Weakly supervised bayesian shape modeling fromunsegmentedmedicalimages.In:InternationalWorkshoponShapeinMedical Imaging. pp. 1–17. Springer (2024)

2024
[3]

Medical image analysis94, 103099 (2024)

Amiranashvili, T., Lüdke, D., Li, H.B., Zachow, S., Menze, B.H.: Learning contin- uous shape priors from sparse data with neural implicit functions. Medical image analysis94, 103099 (2024)

2024
[4]

Medical image analysis91, 103034 (2024)

Bhalodia, R., Elhabian, S., Adams, J., Tao, W., Kavan, L., Whitaker, R.: Deepssm: A blueprint for image-to-shape deep learning models. Medical image analysis91, 103034 (2024)

2024
[5]

In: Statistical shape and deformation analysis, pp

Cates, J., Elhabian, S., Whitaker, R.: Shapeworks: particle-based shape correspon- dence and visualization software. In: Statistical shape and deformation analysis, pp. 257–298. Elsevier (2017)

2017
[6]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recog- nition

Chen, Z., Zhang, H.: Learning implicit fields for generative shape modeling. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recog- nition. pp. 5939–5948 (2019)

2019
[7]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Cheng, J., Fu, B., Ye, J., Wang, G., Li, T., Wang, H., Li, R., Yao, H., Cheng, J., Li, J., et al.: Interactive medical image segmentation: A benchmark dataset and baseline. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 20841–20851 (2025)

2025
[8]

Density estimation using Real NVP

Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real nvp. arXiv preprint arXiv:1605.08803 (2016)

work page internal anchor Pith review arXiv 2016
[9]

Ad- vances in neural information processing systems32(2019)

Durkan, C., Bekasov, A., Murray, I., Papamakarios, G.: Neural spline flows. Ad- vances in neural information processing systems32(2019)

2019
[10]

Strahlentherapie und Onkologie201(3), 236–254 (2025)

Erdur, A.C., Rusche, D., Scholz, D., Kiechle, J., Fischer, S., Llorian-Salvador, O., Buchner, J.A., Nguyen, M.Q., Etzel, L., Weidner, J., et al.: Deep learning for autosegmentation for radiotherapy treatment planning: State-of-the-art and novel perspectives. Strahlentherapie und Onkologie201(3), 236–254 (2025)

2025
[11]

In: 2019 ninth international conference on image processing theory, tools and applications (IPTA)

Goceri, E.: Challenges and recent solutions for image segmentation in the era of deep learning. In: 2019 ninth international conference on image processing theory, tools and applications (IPTA). pp. 1–6. IEEE (2019)

2019
[12]

In: International conference on machine learning

Huang, C.W., Krueger, D., Lacoste, A., Courville, A.: Neural autoregressive flows. In: International conference on machine learning. pp. 2078–2087. PMLR (2018)

2078
[13]

In: International Conference on Medical Image Computing and Computer-Assisted Intervention

Iyer, K., Elhabian, S.Y.: Mesh2ssm: From surface meshes to statistical shape mod- els of anatomy. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 615–625. Springer (2023)

2023
[14]

arXiv preprint arXiv:2502.07145 (2025)

Iyer, K., Karanam, M.S.T., Elhabian, S.: Mesh2ssm++: A probabilistic framework for unsupervised learning of statistical shape model of anatomies from surface meshes. arXiv preprint arXiv:2502.07145 (2025)

work page arXiv 2025
[15]

Neurocomputing468, 492–509 (2022)

Khan, R.A., Luo, Y., Wu, F.X.: Machine learning based liver disease diagnosis: A systematic review. Neurocomputing468, 492–509 (2022)

2022
[16]

In: Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine

Laga, H.: Statistical 3d and 4d shape analysis: Theory and applications in the era of generative ai. In: Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine. pp. 5–6 (2024)

2024
[17]

IEEE transactions on pattern analysis and machine intelligence40(8), 1860–1873 (2017) 10 M

Lüthi, M., Gerig, T., Jud, C., Vetter, T.: Gaussian process morphable models. IEEE transactions on pattern analysis and machine intelligence40(8), 1860–1873 (2017) 10 M. Karanam et al

2017
[18]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 4460–4470 (2019)

2019
[19]

IEEE transactions on medical imaging37(2), 384–395 (2017)

Oktay, O., Ferrante, E., Kamnitsas, K., Heinrich, M., Bai, W., Caballero, J., Cook, S.A., De Marvao, A., Dawes, T., O‘Regan, D.P., et al.: Anatomically constrained neural networks (acnns): application to cardiac image enhancement and segmen- tation. IEEE transactions on medical imaging37(2), 384–395 (2017)

2017
[20]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Park,J.J.,Florence,P.,Straub,J.,Newcombe,R.,Lovegrove,S.:Deepsdf:Learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 165– 174 (2019)

2019
[21]

Journal of Orthopaedic Research®40(12), 2873–2884 (2022)

Peiffer, M., Burssens, A., De Mits, S., Heintz, T., Van Waeyenberge, M., Buedts, K., Victor, J., Audenaert, E.: Statistical shape model-based tibiofibular assess- ment of syndesmotic ankle lesions using weight-bearing ct. Journal of Orthopaedic Research®40(12), 2873–2884 (2022)

2022
[22]

In: Inter- national conference on machine learning

Rezende, D., Mohamed, S.: Variational inference with normalizing flows. In: Inter- national conference on machine learning. pp. 1530–1538. PMLR (2015)

2015
[23]

Inter- national Journal of Computer Assisted Radiology and Surgery20(9), 1863–1873 (2025)

Ryabtsev, A., Lederman, R., Sosna, J., Joskowicz, L.: Streamlining the annotation process by radiologists of volumetric medical images with few-shot learning. Inter- national Journal of Computer Assisted Radiology and Surgery20(9), 1863–1873 (2025)

2025
[24]

In: 2025 IEEE/CVF Win- ter Conference on Applications of Computer Vision (WACV)

Saha, S., Joshi, S., Whitaker, R.: Ard-vae: A statistical formulation to find the relevant latent dimensions of variational autoencoders. In: 2025 IEEE/CVF Win- ter Conference on Applications of Computer Vision (WACV). pp. 889–898. IEEE (2025)

2025
[25]

IEEE transactions on pattern analysis and machine intelligence45(8), 9284–9305 (2023)

Shen, W., Peng, Z., Wang, X., Wang, H., Cen, J., Jiang, D., Xie, L., Yang, X., Tian, Q.: A survey on label-efficient deep image segmentation: Bridging the gap between weak supervision and dense prediction. IEEE transactions on pattern analysis and machine intelligence45(8), 9284–9305 (2023)

2023
[26]

Computer Optics41(6), 897–904 (2017)

Smelkina, N.A., Kosarev, R.N., Nikonorov, A.V., Bairikov, I.M., Ryabov, K.N., Avdeev, E., Kazanskii, N.L.: Reconstruction of anatomical structures using statis- tical shape modeling. Computer Optics41(6), 897–904 (2017)

2017
[27]

In- ternational Journal of Computer Vision128(5), 1162–1181 (2020)

Stutz, D., Geiger, A.: Learning 3d shape completion under weak supervision. In- ternational Journal of Computer Vision128(5), 1162–1181 (2020)

2020
[28]

Visual Intelligence2(1), 9 (2024)

Sun, J.M., Wu, T., Gao, L.: Recent advances in implicit representation-based 3d shape generation. Visual Intelligence2(1), 9 (2024)

2024
[29]

arXiv preprint arXiv:2407.15260 (2024)

Ukey, J., Kataria, T., Elhabian, S.Y.: Weakly ssm: on the viability of weakly supervised segmentations for statistical shape modeling. arXiv preprint arXiv:2407.15260 (2024)

work page arXiv 2024
[30]

In: European Conference on Computer Vision

Wong, H.E., Rakic, M., Guttag, J., Dalca, A.V.: Scribbleprompt: fast and flexible interactive segmentation for any biomedical image. In: European Conference on Computer Vision. pp. 207–229. Springer (2024)

2024
[31]

Electronics12(5), 1199 (2023)

Yu, Y., Wang, C., Fu, Q., Kou, R., Huang, F., Yang, B., Yang, T., Gao, M.: Techniques and challenges of image segmentation: A review. Electronics12(5), 1199 (2023)

2023
[32]

Advanced Engineering Informatics69, 104104 (2026)

Zhang, X., Ying, S., Zhai, L., Chen, Y.: Weakly supervised learning for inter- pretable 3d shape similarity metrics on automated design compliance in building morphology management. Advanced Engineering Informatics69, 104104 (2026)

2026