pith. machine review for the scientific record.

arxiv: 2604.09411 · v1 · submitted 2026-04-10 · 💻 cs.CV

Recognition: unknown

SynFlow: Scaling Up LiDAR Scene Flow Estimation with Synthetic Data

Chenhan Jiang, Patric Jensfelt, Qingwen Zhang, Xiaomeng Zhu

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 16:41 UTC · model grok-4.3

classification 💻 cs.CV
keywords LiDAR scene flow · synthetic data · zero-shot generalization · motion estimation · 3D perception · domain invariance · label efficiency

The pith

Models trained only on synthetic LiDAR scene flow data match or beat real supervised baselines on multiple benchmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that the scarcity of dense real-world motion labels blocks progress on reliable 3D dynamic perception from LiDAR, and that a large synthetic dataset focused on kinematic variety can supply the missing priors. It introduces a generation pipeline that produces 4,000 sequences and roughly 940,000 frames of annotated scene flow, 34 times the volume of existing real benchmarks. Models trained exclusively on this synthetic collection generalize zero-shot to real test sets, reaching performance comparable to in-domain supervised training on nuScenes and exceeding prior state-of-the-art on TruckScenes by 31.8 percent. The same data also serves as an efficient foundation: fine-tuning on just 5 percent of real labels outperforms models trained from scratch on the entire real budget.

Core claim

SynFlow is a motion-oriented synthetic data pipeline that generates diverse kinematic patterns across 4,000 LiDAR sequences without emphasizing sensor-specific realism. Models trained solely on the resulting SynFlow-4k dataset learn domain-invariant motion priors that transfer directly to real-world scenes, rivaling fully supervised in-domain baselines on nuScenes and surpassing existing methods on TruckScenes by 31.8 percent while enabling label-efficient adaptation that beats full real-data training with only 5 percent of the labels.
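
The label-efficiency claim reduces to a simple protocol: initialize from the synthetic-only checkpoint, fine-tune on a small fraction of labeled real frames, and compare against training from scratch on the full real budget. Below is a minimal sketch of that protocol, assuming a generic PyTorch scene flow model; SceneFlowNet, the batch keys, and the hyperparameters are hypothetical placeholders, not the paper's training recipe.

```python
# Hypothetical sketch of the 5% label-efficiency protocol: start from weights
# pretrained purely on synthetic data, then fine-tune on a small labeled subset.
import torch
from torch.utils.data import DataLoader, Subset

def finetune_on_fraction(model, real_dataset, fraction=0.05, epochs=10, lr=1e-4):
    """Fine-tune a synthetic-pretrained model on `fraction` of the labeled real data."""
    n = len(real_dataset)
    k = max(1, int(fraction * n))
    # Draw a random labeled subset; a fixed seed keeps the comparison reproducible.
    idx = torch.randperm(n, generator=torch.Generator().manual_seed(0))[:k]
    loader = DataLoader(Subset(real_dataset, idx.tolist()), batch_size=8, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            pred_flow = model(batch["pc_t0"], batch["pc_t1"])  # (N, 3) per-point flow
            loss = torch.nn.functional.l1_loss(pred_flow, batch["flow_gt"])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

# Usage (hypothetical names): compare against training from scratch on 100% of labels.
# model = SceneFlowNet(); model.load_state_dict(torch.load("synflow4k_pretrained.pt"))
# model = finetune_on_fraction(model, nuscenes_train, fraction=0.05)
```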

What carries the argument

The SynFlow data generation pipeline, which uses a motion-oriented strategy to synthesize varied kinematic patterns across many sequences rather than maximizing sensor fidelity.
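
Figure 2 (below) indicates the pipeline runs on CARLA with a configurable ego LiDAR and Traffic-Manager-driven surrounding agents. The following is a minimal sketch of that kind of synchronous rollout using the standard CARLA Python API; blueprints, sensor settings, and agent counts are illustrative, not SynFlow's configuration, and per-point scene flow labels would additionally require logging actor poses at every tick.

```python
# Minimal CARLA rollout sketch (illustrative, not SynFlow's actual configuration):
# a synchronous world, one ego vehicle with a spinning LiDAR, and autopilot traffic.
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# Synchronous stepping keeps LiDAR frames aligned with the simulation clock.
settings = world.get_settings()
settings.synchronous_mode = True
settings.fixed_delta_seconds = 0.1  # 10 Hz, matching a typical LiDAR rotation rate
world.apply_settings(settings)

bp_lib = world.get_blueprint_library()
spawn_points = world.get_map().get_spawn_points()

# Ego vehicle with a configurable ray-cast LiDAR (64 beams here; 32 is one swap away).
ego = world.spawn_actor(bp_lib.find("vehicle.tesla.model3"), spawn_points[0])
lidar_bp = bp_lib.find("sensor.lidar.ray_cast")
lidar_bp.set_attribute("channels", "64")
lidar_bp.set_attribute("range", "100")
lidar_bp.set_attribute("rotation_frequency", "10")
lidar = world.spawn_actor(lidar_bp, carla.Transform(carla.Location(z=1.8)), attach_to=ego)

frames = []
lidar.listen(frames.append)

# Surrounding agents under the Traffic Manager supply the moving scene content.
tm = client.get_trafficmanager()
tm.set_synchronous_mode(True)
ego.set_autopilot(True, tm.get_port())
for sp in spawn_points[1:21]:
    actor = world.try_spawn_actor(bp_lib.find("vehicle.audi.tt"), sp)
    if actor is not None:
        actor.set_autopilot(True, tm.get_port())

for _ in range(200):  # 20 s rollout
    world.tick()
```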

If this is right

  • Scene flow models can reach strong real-world accuracy without any real labeled training data.
  • Label requirements for high performance drop by a factor of twenty when starting from the synthetic prior.
  • Motion estimation scales with simulation compute rather than annotation budgets.
  • Generalizable 3D motion understanding becomes feasible beyond the coverage of any single real dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Kinematic diversity in simulation appears more critical than photorealism for learning transferable motion priors.
  • The same motion-focused synthesis approach could extend to camera or radar scene flow with minimal changes.
  • Combining synthetic pre-training with self-supervised signals on real data may close any remaining gap without full labels.

Load-bearing premise

The synthetic motion patterns are representative enough of real-world vehicle and object kinematics that models learn priors without a remaining domain gap.

What would settle it

A new real-world LiDAR benchmark whose motion statistics fall outside the kinematic range covered by SynFlow-4k, on which zero-shot synthetic models fall substantially below supervised real-data baselines.

Figures

Figures reproduced from arXiv: 2604.09411 by Chenhan Jiang, Patric Jensfelt, Qingwen Zhang, Xiaomeng Zhu.

Figure 1
Figure 1: Scaling up LiDAR Scene Flow with Synthetic Data. We present SynFlow, a data generation pipeline leveraging the CARLA simulator to synthesize diverse, perfectly labeled LiDAR scene flow data (center). While real-world datasets are often constrained by high annotation costs and limited scenario diversity, SynFlow provides a scalable source of dense, noise-free supervision for learning robust motion priors. … view at source ↗
Figure 2
Figure 2: Overview of our SynFlow pipeline and dataset examples. Left: A CARLA world provides diverse road topologies; we construct a route bank using topology-aware coverage to ensure broad spatial exploration and execute rollouts under Traffic Management (TM). Middle: our procedural data engine instantiates an ego vehicle with configurable LiDAR, spawns surrounding agents with controllable policies, and runs synch… view at source ↗
Figure 3
Figure 3: Zero-shot scaling performance (1k–4k sequences). Evaluation on Aeva (a) and TruckScenes (b) using Dynamic Bucket-Normalized EPE (lower is better). Solid blue line indicates the overall mean; dashed lines represent per-category breakdowns. Performance improves consistently across both benchmarks, with the most significant gains observed between 1k and 2k. Beyond this point, accuracy begins to stabilize, suggesting… view at source ↗
Figure 4
Figure 4: Dataset visualization. Top: SynFlow-4k samples under 64-beam (row 1) and 32-beam (row 2) configurations, spanning city, roundabout, highway, and merging scenarios. Per-point scene flow labels are rendered as colored vectors. Direction is encoded as hue, and magnitude as saturation. Bottom: representative samples from real-world datasets (nuScenes and TruckScenes). … view at source ↗
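
Figure 3 scores models with Dynamic Bucket-Normalized EPE. The paper's exact bucket edges and category grouping are not reproduced on this page, but the core of the metric can be sketched: restrict to dynamic points, split them into speed buckets, and normalize each bucket's mean endpoint error by its mean ground-truth speed. The bucket edges and frame interval below are illustrative assumptions, not the paper's protocol.

```python
# A minimal, hedged sketch of a bucket-normalized endpoint-error metric in the spirit
# of the Dynamic Bucket-Normalized EPE reported in Figure 3.
import numpy as np

def dynamic_bucket_normalized_epe(pred_flow, gt_flow, dt=0.1,
                                  speed_edges=(0.4, 2.0, 6.0, 10.0, np.inf)):
    """pred_flow, gt_flow: (N, 3) per-point displacement between two sweeps [m]."""
    epe = np.linalg.norm(pred_flow - gt_flow, axis=1)   # endpoint error per point
    speed = np.linalg.norm(gt_flow, axis=1) / dt        # ground-truth speed [m/s]

    scores = []
    lo = speed_edges[0]                                  # points below this are "static"
    for hi in speed_edges[1:]:
        mask = (speed >= lo) & (speed < hi)
        if mask.any():
            # Normalize the bucket's mean error by its mean speed, so slow and fast
            # objects contribute comparably; lower is better.
            scores.append(epe[mask].mean() / max(speed[mask].mean(), 1e-6))
        lo = hi
    return float(np.mean(scores)) if scores else float("nan")
```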
read the original abstract

Reliable 3D dynamic perception requires models that can anticipate motion beyond predefined categories, yet progress is hindered by the scarcity of dense, high-quality motion annotations. While self-supervision on unlabeled real data offers a path forward, empirical evidence suggests that scaling unlabeled data fails to close the performance gap due to noisy proxy signals. In this paper, we propose a shift in paradigm: learning robust real-world motion priors entirely from scalable simulation. We introduce SynFlow, a data generation pipeline that generates a large-scale synthetic dataset specifically designed for LiDAR scene flow. Unlike prior works that prioritize sensor-specific realism, SynFlow employs a motion-oriented strategy to synthesize diverse kinematic patterns across 4,000 sequences (~940k frames), termed SynFlow-4k. This represents a 34x scale-up in annotated volume over existing real-world benchmarks. Our experiments demonstrate that SynFlow-4k provides a highly domain-invariant motion prior. In a zero-shot regime, models trained exclusively on our synthetic data generalize across multiple real-world benchmarks, rivaling in-domain supervised baselines on nuScenes and outperforming state-of-the-art methods on TruckScenes by 31.8%. Furthermore, SynFlow-4k serves as a label-efficient foundation: fine-tuning with only 5% of real-world labels surpasses models trained from scratch on the full available budget. We open-source the pipeline and dataset to facilitate research in generalizable 3D motion estimation. More detail can be found at https://kin-zhang.github.io/SynFlow.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces SynFlow, a motion-oriented synthetic data generation pipeline for LiDAR scene flow estimation. It produces the SynFlow-4k dataset comprising 4,000 sequences (~940k frames), a 34x scale-up over existing real-world annotated benchmarks. The central claims are that models trained exclusively on this synthetic data achieve zero-shot generalization to real-world benchmarks, rivaling in-domain supervised baselines on nuScenes and outperforming prior state-of-the-art methods on TruckScenes by 31.8%, while also serving as a label-efficient foundation where fine-tuning on only 5% of real labels surpasses full real-data training from scratch. The pipeline and dataset are open-sourced.

Significance. If the zero-shot transfer results hold after verification of the evaluation protocol and motion distribution overlap, the work would be significant for addressing annotation scarcity in 3D dynamic perception by demonstrating that scalable synthetic motion patterns can induce domain-invariant priors. The 34x data scale-up, emphasis on kinematic diversity over sensor realism, and open-sourcing of the pipeline represent concrete strengths that could enable broader reproducibility and follow-on research in generalizable scene flow estimation.

major comments (2)
  1. [Experiments / zero-shot evaluation] The zero-shot generalization results (abstract and experiments section) are load-bearing for the domain-invariance claim. The manuscript lacks any quantitative comparison or ablation of kinematic distributions (e.g., velocity histograms, trajectory curvature, or multi-object interaction frequencies) between SynFlow-4k and the target real datasets (nuScenes, TruckScenes). Without this, it remains possible that performance gains exploit synthetic-specific regularities rather than learned general motion priors, undermining the 'motion-oriented strategy' premise. (A sketch of such a distribution check follows this report.)
  2. [SynFlow Pipeline] The pipeline description (likely §3) does not detail the range of object categories, rigid-body assumptions, or sampling strategy used to generate diverse kinematic patterns across the 4,000 sequences. This information is required to assess whether the synthetic motions are sufficiently representative for the reported zero-shot transfer and label-efficiency results.
minor comments (2)
  1. [Abstract] Specify the precise metric (e.g., mean endpoint error or accuracy threshold) underlying the '31.8%' improvement on TruckScenes and the 'rivaling' claim on nuScenes for reproducibility.
  2. [Abstract / Introduction] The '34x scale-up in annotated volume' claim should explicitly name the real-world benchmark(s) used for the comparison.
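
Major comment 1 asks for a quantitative comparison of kinematic distributions between SynFlow-4k and the real target sets. Below is a minimal sketch of one such check on per-point ground-truth speeds, assuming a 0.1 s frame interval; the same pattern extends to trajectory curvature or interaction counts. The thresholds and diagnostics are illustrative choices, not the paper's analysis.

```python
# Hypothetical sketch of the distribution check requested in major comment 1:
# compare per-point ground-truth speed distributions between synthetic and real data.
import numpy as np
from scipy.stats import wasserstein_distance

def point_speeds(flows, dt=0.1):
    """Per-point ground-truth speeds [m/s] from (N, 3) flow vectors between sweeps."""
    return np.linalg.norm(flows, axis=1) / dt

def kinematic_gap(syn_flows, real_flows, max_speed=40.0, bins=80):
    """Diagnostics of how closely the synthetic speed distribution covers the real one."""
    syn, real = point_speeds(syn_flows), point_speeds(real_flows)
    edges = np.linspace(0.0, max_speed, bins + 1)
    syn_hist, _ = np.histogram(syn, bins=edges, density=True)
    real_hist, _ = np.histogram(real, bins=edges, density=True)
    return {
        "wasserstein": wasserstein_distance(syn, real),  # smaller = closer distributions
        "real_mass_outside_syn_support": float(
            real_hist[syn_hist == 0].sum() / max(real_hist.sum(), 1e-12)
        ),  # fraction of real speed bins that the synthetic data never covers
    }
```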

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The major comments identify key areas where additional analysis and description can strengthen the manuscript's claims regarding domain invariance and pipeline reproducibility. We address each point below and have revised the manuscript to incorporate the requested information.

read point-by-point responses
  1. Referee: The zero-shot generalization results (abstract and experiments section) are load-bearing for the domain-invariance claim. The manuscript lacks any quantitative comparison or ablation of kinematic distributions (e.g., velocity histograms, trajectory curvature, or multi-object interaction frequencies) between SynFlow-4k and the target real datasets (nuScenes, TruckScenes). Without this, it remains possible that performance gains exploit synthetic-specific regularities rather than learned general motion priors, undermining the 'motion-oriented strategy' premise.

    Authors: We agree that a direct quantitative comparison of kinematic distributions is necessary to support the claim that performance arises from general motion priors rather than synthetic-specific artifacts. In the revised manuscript, we have added a new subsection (Section 4.3) with velocity histograms, trajectory curvature statistics, and multi-object interaction frequencies for SynFlow-4k versus nuScenes and TruckScenes. The analysis shows SynFlow-4k spans a wider velocity and curvature range while maintaining comparable interaction frequencies; combined with the 31.8% improvement on TruckScenes (which exhibits distinct motion statistics), this indicates the learned features are domain-invariant. We have also clarified in the text why the motion-oriented generation strategy reduces the likelihood of exploiting dataset-specific regularities. revision: yes

  2. Referee: The pipeline description (likely §3) does not detail the range of object categories, rigid-body assumptions, or sampling strategy used to generate diverse kinematic patterns across the 4,000 sequences. This information is required to assess whether the synthetic motions are sufficiently representative for the reported zero-shot transfer and label-efficiency results.

    Authors: We appreciate this observation and have substantially expanded the pipeline description in Section 3 of the revised manuscript. The updated text now specifies the full range of object categories (cars, trucks, buses, pedestrians, cyclists, and miscellaneous dynamic objects), the rigid-body assumptions applied to each category's motion (e.g., piecewise constant velocity with bounded perturbations for vehicles versus higher-variance trajectories for pedestrians), and the sampling strategy: initial velocities (0–40 m/s), accelerations, and yaw rates are drawn from broad distributions with added variance to promote diversity, while interactions are simulated via rule-based collision avoidance. These details enable readers to evaluate the kinematic representativeness underlying the zero-shot and label-efficient results. revision: yes
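
The sampling strategy described in this response (initial velocities from 0 to 40 m/s, accelerations and yaw rates drawn from broad distributions, piecewise-constant velocity with bounded perturbations, higher trajectory variance for pedestrians) can be written down compactly. Below is a minimal sketch under those stated assumptions; the specific distribution parameters are illustrative, and rule-based collision avoidance is omitted.

```python
# Hypothetical sketch of the kind of kinematic sampling the rebuttal describes:
# broad initial velocities, perturbed piecewise-constant motion, per-category variance.
import numpy as np

rng = np.random.default_rng(0)

def sample_agent(category="vehicle"):
    """Draw an initial kinematic state; pedestrians get higher trajectory variance."""
    if category == "vehicle":
        return {"speed": rng.uniform(0.0, 40.0),   # m/s, broad range
                "yaw_rate": rng.normal(0.0, 0.1),  # rad/s
                "accel_std": 0.5, "yaw_std": 0.02}
    return {"speed": rng.uniform(0.0, 2.0),        # pedestrians are slow ...
            "yaw_rate": rng.normal(0.0, 0.5),
            "accel_std": 1.0, "yaw_std": 0.2}      # ... but more erratic

def rollout(state, steps=100, dt=0.1):
    """Piecewise-constant velocity with bounded random perturbations per step."""
    x = y = yaw = 0.0
    speed, yaw_rate = state["speed"], state["yaw_rate"]
    traj = []
    for _ in range(steps):
        speed = float(np.clip(speed + rng.normal(0.0, state["accel_std"]) * dt, 0.0, 40.0))
        yaw_rate += rng.normal(0.0, state["yaw_std"]) * dt
        yaw += yaw_rate * dt
        x += speed * np.cos(yaw) * dt
        y += speed * np.sin(yaw) * dt
        traj.append((x, y, yaw, speed))
    return np.array(traj)
```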

Circularity Check

0 steps flagged

No circularity: claims rest on empirical zero-shot transfer to external real benchmarks

full rationale

The paper's core contribution is a synthetic data pipeline (SynFlow) whose value is demonstrated by training models exclusively on SynFlow-4k and measuring generalization on independent real-world datasets (nuScenes, TruckScenes). No equations, fitted parameters, or self-citations are invoked to derive the reported performance numbers; the results are obtained by direct evaluation on held-out external benchmarks. The motion-oriented synthesis strategy and domain-invariance claim are presented as empirical observations rather than tautological re-statements of the input data or prior self-citations. This is a standard non-circular empirical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the assumption that the simulated kinematic patterns are representative enough to serve as a domain-invariant prior; no free parameters or new physical entities are introduced beyond the data-generation pipeline itself.

axioms (1)
  • domain assumption: Synthetic kinematic patterns generated by the pipeline are statistically close enough to real-world motion distributions to support zero-shot transfer.
    Invoked when claiming that models trained only on SynFlow-4k rival supervised real-data baselines.

pith-pipeline@v0.9.0 · 5581 in / 1268 out tokens · 48561 ms · 2026-05-10T16:41:12.457339+00:00 · methodology

discussion (0)

