DreamReg: Belief-Driven World Model for 2D-3D Ultrasound Registration
Pith reviewed 2026-06-26 21:47 UTC · model grok-4.3
The pith
DreamReg registers 2D ultrasound slices to 3D volumes by maintaining a latent belief state over rigid transformations and updating it through internal simulation of probe motions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DreamReg formulates 2D-3D ultrasound registration as belief updating over rigid transformations. It maintains a latent belief state that summarizes past observations and pose information, and continuously refines the transformation through learned dynamics as new slices arrive. During inference, DreamReg refines registration via internal imagination: it rolls out the learned world model to simulate candidate probe motions and their predicted observations, and integrates these imagined outcomes to converge to an accurate rigid transformation.
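The belief-updating idea can be illustrated with a deliberately simplified sketch. This is not DreamReg's learned latent model: the belief here is an explicit histogram over candidate 1D offsets (a stand-in for a rigid transformation), the "volume" is a random 1D intensity profile, and the names `update_belief` and `run_demo`, the value of `sigma`, and the noise level are all assumptions made for illustration.

```python
import math
import random


def update_belief(belief, volume, action, observation, sigma=0.1):
    """One Bayes-filter step over candidate offsets (toy stand-in for a latent belief)."""
    n = len(volume)
    posterior = []
    for k, prior in enumerate(belief):
        # What the probe would see if the true offset were k and it moved by `action`.
        predicted = volume[(k + action) % n]
        likelihood = math.exp(-((observation - predicted) ** 2) / (2 * sigma ** 2))
        posterior.append(prior * likelihood)
    total = sum(posterior)
    if total == 0.0:
        # All hypotheses underflowed: fall back to a uniform belief.
        return [1.0 / n] * n
    return [p / total for p in posterior]


def run_demo(seed=0, steps=12):
    """Accumulate evidence from noisy samples taken at random probe shifts."""
    rng = random.Random(seed)
    n = 32
    volume = [rng.random() for _ in range(n)]  # hypothetical 1D "volume"
    true_offset = 11                           # hypothetical ground-truth transform
    belief = [1.0 / n] * n                     # start with no pose information
    for _ in range(steps):
        action = rng.randrange(n)              # hypothetical probe motion
        obs = volume[(true_offset + action) % n] + rng.gauss(0.0, 0.05)
        belief = update_belief(belief, volume, action, obs)
    estimate = max(range(n), key=belief.__getitem__)
    return estimate, true_offset, belief


estimate, true_offset, belief = run_demo()
print("estimated offset:", estimate, "true offset:", true_offset)
```

A single noisy sample is compatible with several offsets, which is the one-shot ambiguity the paper describes. The histogram sharpens only as samples from different probe positions are combined.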
What carries the argument
Latent belief state over rigid transformations, updated by conditioning pose refinement on the current US observation and on simulated future observations from the learned dynamics model.
If this is right
- Registration accuracy improves as additional slices are acquired because the belief state accumulates evidence rather than depending on any single observation.
- The method can accommodate action-dependent acquisition by internally testing how different probe adjustments would affect the observed image.
- Experiments on the CAMUS and u-RegPro datasets show competitive accuracy and greater robustness than prior registration techniques.
- Real-time guidance becomes feasible because the system converges through repeated internal roll-outs without requiring exhaustive external search at each step.
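One way to picture "internally testing how different probe adjustments would affect the observed image" is to score each candidate motion by how much the imagined observations would sharpen the belief. The sketch below is an assumption-laden analogy, not the paper's integration scheme: it reuses a toy histogram belief over 1D offsets, takes expected posterior entropy as the score, and all function names (`entropy`, `bayes_update`, `expected_entropy`, `pick_action`) are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def bayes_update(belief, volume, action, observation, sigma=0.1):
    """Reweight each offset hypothesis by how well it explains the observation."""
    n = len(volume)
    posterior = [
        p * math.exp(-((observation - volume[(k + action) % n]) ** 2) / (2 * sigma ** 2))
        for k, p in enumerate(belief)
    ]
    total = sum(posterior)
    return [p / total for p in posterior] if total > 0.0 else list(belief)


def expected_entropy(belief, volume, action, sigma=0.1):
    """Imagine the observation each hypothesis predicts and average the resulting entropy."""
    n = len(volume)
    total = 0.0
    for k, p in enumerate(belief):
        if p == 0.0:
            continue
        imagined_obs = volume[(k + action) % n]  # noise-free imagined observation
        total += p * entropy(bayes_update(belief, volume, action, imagined_obs, sigma))
    return total


def pick_action(belief, volume, candidate_actions, sigma=0.1):
    """Choose the probe motion whose imagined outcomes leave the least uncertainty."""
    return min(candidate_actions, key=lambda a: expected_entropy(belief, volume, a, sigma))


# Hypothetical setup: two offsets (0 and 1) are equally plausible.
volume = [0.0, 0.0, 0.0, 1.0]
belief = [0.5, 0.5, 0.0, 0.0]
best = pick_action(belief, volume, candidate_actions=[0, 2])
print("chosen probe shift:", best)
```

In this setup a shift of 0 shows both hypotheses the same intensity and therefore cannot separate them, while a shift of 2 shows them different intensities. The selection rule prefers the second motion without any external search.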
Where Pith is reading between the lines
- The same belief-plus-simulation structure could be applied to other medical imaging tasks where the sensor can be moved deliberately, such as freehand 3D reconstruction or catheter tracking.
- If the world model generalizes across patients, training data requirements might shrink because the system learns predictive dynamics rather than memorizing appearance templates.
- Clinical workflows might change if operators learn to move the probe in ways that the model can most easily disambiguate.
Load-bearing premise
The dynamics model trained on clinical-style trajectories will correctly predict the ultrasound images that would result from probe motions and patient anatomies never seen in training.
What would settle it
Apply the trained model to a held-out patient or to probe trajectories that differ markedly from the training distribution and measure whether the final registration error exceeds that of standard one-shot or short-horizon baselines.
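Measuring "final registration error" for such a test needs an explicit error definition. A common choice for rigid transforms, assumed here rather than taken from the paper, is the geodesic rotation angle plus the Euclidean translation distance. The helper names `rotation_error_deg`, `translation_error`, and `rot_z` are hypothetical.

```python
import math


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_est^T R_gt) equals the sum of elementwise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error(t_est, t_gt):
    """Euclidean distance between two translation vectors, in the input units."""
    return math.dist(t_est, t_gt)


def rot_z(angle_deg):
    """Rotation about the z-axis, used here only to build example inputs."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical estimate and ground truth for one held-out case.
print(rotation_error_deg(rot_z(30.0), rot_z(20.0)))
print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```

Computing both quantities per held-out case for DreamReg and for each baseline would give the comparison this test calls for.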
Original abstract
Ultrasound (US) is widely used for surgical navigation, yet real-time registration between intraoperative 2D slices and preoperative 3D volumes remains challenging due to partial observability, speckle noise, and action-dependent US acquisition. Existing methods are one-shot or short-horizon, making it hard for them to gather evidence over time or capture how surgeons adjust probe motion based on on-screen feedback. We propose DreamReg, a belief-driven world-model framework that formulates 2D-3D registration as belief updating over rigid transformations. DreamReg maintains a latent belief state that summarizes past observations and pose information, and continuously refines the transformation through learned dynamics as new slices arrive. During training, DreamReg is exposed to probe-motion trajectories that mimic clinical scanning behavior and learns to update its belief by conditioning pose refinement on the current US observation. During inference, DreamReg refines registration via internal imagination: it rolls out the learned world model to simulate candidate probe motions and their predicted observations, and integrates these imagined outcomes to converge to an accurate rigid transformation. Experiments on CAMUS and u-RegPro datasets demonstrate improved robustness and competitive registration accuracy for real-time guidance compared with state-of-the-art methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DreamReg, a belief-driven world-model framework for 2D-3D ultrasound registration. It maintains a latent belief state summarizing past observations and poses, refines the rigid transformation via learned dynamics conditioned on new US slices, and during inference rolls out the world model to simulate candidate probe motions and integrate imagined outcomes for convergence. Training uses probe-motion trajectories mimicking clinical scanning; experiments on CAMUS and u-RegPro datasets are claimed to demonstrate improved robustness and competitive accuracy versus state-of-the-art methods.
Significance. If the central claim holds, the approach would address key limitations of one-shot or short-horizon registration methods by enabling evidence accumulation over time in the presence of partial observability, speckle noise, and action-dependent acquisition, potentially improving robustness for real-time surgical navigation.
Major comments (2)
- Abstract: the claim of 'improved robustness' and 'competitive registration accuracy' on CAMUS and u-RegPro is stated without any quantitative metrics, error bars, ablation studies, or baseline comparisons, so the central empirical claim cannot be assessed from the provided text.
- Inference procedure (abstract description): the load-bearing assumption that the learned dynamics model produces faithful predicted observations for arbitrary unseen probe motions and new anatomies during internal rollout is not supported by any held-out prediction-error metrics, ablation of rollout versus direct regression, or generalization tests, leaving the belief-update convergence claim unverified.
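The quantitative support the first comment asks for could take the form of a mean and standard deviation per method plus a confidence interval on the paired difference. The sketch below is one possible reporting recipe under stated assumptions: a percentile bootstrap over per-case error differences, with hypothetical function names, and placeholder numbers that are not results from the paper.

```python
import random
import statistics


def summarize(errors):
    """Mean and sample standard deviation of per-case registration errors."""
    return statistics.mean(errors), statistics.stdev(errors)


def paired_bootstrap_ci(errors_a, errors_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of (errors_a - errors_b) on the same cases."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    means = []
    for _ in range(n_resamples):
        # Resample cases with replacement, keeping each pair together.
        sample = [rng.choice(diffs) for _ in diffs]
        means.append(statistics.fmean(sample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Placeholder per-case errors (illustrative only, not taken from the paper).
baseline_errors = [5.0, 6.0, 7.0, 8.0]
method_errors = [1.0, 2.0, 3.0, 4.0]
print(summarize(method_errors))
print(paired_bootstrap_ci(baseline_errors, method_errors))
```

An interval on the paired difference that excludes zero would back the "improved robustness" wording more directly than two separate means.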
Simulated Author's Rebuttal
We thank the referee for the thoughtful comments on our manuscript. We address each major comment point-by-point below and outline the revisions we will incorporate.
- Referee: Abstract: the claim of 'improved robustness' and 'competitive registration accuracy' on CAMUS and u-RegPro is stated without any quantitative metrics, error bars, ablation studies, or baseline comparisons, so the central empirical claim cannot be assessed from the provided text.
  Authors: We agree that the abstract would be strengthened by including brief quantitative support for the claims. The full manuscript reports detailed results with metrics, error bars, and baseline comparisons in the Experiments section. We will revise the abstract to include key quantitative highlights (e.g., mean registration errors and relative improvements) while remaining within length constraints.
  Revision: yes
- Referee: Inference procedure (abstract description): the load-bearing assumption that the learned dynamics model produces faithful predicted observations for arbitrary unseen probe motions and new anatomies during internal rollout is not supported by any held-out prediction-error metrics, ablation of rollout versus direct regression, or generalization tests, leaving the belief-update convergence claim unverified.
  Authors: The dynamics model is trained end-to-end on probe trajectories that include held-out sequences, and its effectiveness is demonstrated indirectly via downstream registration accuracy. However, we acknowledge the absence of explicit held-out prediction-error metrics or rollout-specific ablations in the current version. We will add a dedicated analysis of prediction fidelity, an ablation comparing rollout-based inference to direct regression, and generalization tests on unseen anatomies in the revised manuscript.
  Revision: yes
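A prediction-fidelity analysis of the kind promised here could compare each imagined frame with the frame actually acquired at the same rollout step. The sketch below uses normalized cross-correlation as the similarity score. That choice, and the helper names `ncc` and `fidelity_by_horizon`, are assumptions for illustration and not the authors' stated protocol.

```python
import math


def ncc(image_a, image_b):
    """Normalized cross-correlation between two equally sized 2D images (nested lists)."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A constant image has zero variance; report 0.0 rather than dividing by zero.
    return num / den if den > 0.0 else 0.0


def fidelity_by_horizon(predicted_frames, actual_frames):
    """NCC at each rollout step, to show how prediction quality changes with horizon."""
    return [ncc(p, a) for p, a in zip(predicted_frames, actual_frames)]


# Tiny hypothetical frames: the second prediction is contrast-inverted.
frame = [[0.0, 1.0], [2.0, 3.0]]
inverted = [[-v for v in row] for row in frame]
print(fidelity_by_horizon([frame, frame], [frame, inverted]))
```

Reporting this curve on held-out patients and unseen trajectories, alongside registration error with and without rollouts, would address the fidelity and ablation questions separately.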
Circularity Check
No circularity; framework is data-driven with independent training and inference stages
The paper formulates registration as belief updating in a learned world model trained on trajectories that mimic clinical scanning. No equations, fitted parameters, or self-citations are shown that reduce the claimed inference-time rollouts or belief refinements to the training inputs by construction. The derivation chain consists of standard supervised learning on observed data followed by model rollout, which is self-contained against external benchmarks and does not match any enumerated circularity pattern.