arxiv: 2604.04737 · v2 · submitted 2026-04-06 · 📡 eess.SP

Recognition: 2 theorem links

· Lean Theorem

LEAN-3D: Low-latency Hierarchical Point Cloud Codec for Mobile 3D Streaming

Yuchen Gao , Qi Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-10 19:00 UTC · model grok-4.3

classification 📡 eess.SP

keywords point cloud compressionlow-latency streamingmobile edge computinglearned compressionhierarchical codingoccupancy model3D streamingedge codec

0 comments

The pith

A hybrid point cloud codec limits learned modeling to uncertain shallow levels and uses deterministic coding deeper to cut mobile streaming latency by 3-5x.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops LEAN-3D to overcome the high runtime cost of learned point cloud compression on mobile devices, which blocks low-latency 3D streaming under tight compute and power limits. It restricts the learned occupancy model to shallow levels of the hierarchy where uncertainty is highest and applies a deterministic scheme for the deeper near-unary regime. The design also eliminates cross-platform decoding failures caused by numerical inconsistencies in entropy coding. Experiments on an NVIDIA Jetson Orin Nano show 3-5x lower latency and up to 5.1x energy savings while keeping rate-distortion performance competitive. A reader would care because this approach could finally enable practical immersive 3D communication on everyday edge hardware.

Core claim

LEAN-3D is a compute-aware point cloud codec that places a lightweight learned occupancy model at the shallow levels of a sparse occupancy hierarchy and develops a lightweight deterministic coding scheme for the deep hierarchy tailored to the near-unary regime. The complete encoder/decoder pipeline resolves numerical inconsistencies in lossless entropy decoding across heterogeneous platforms. When evaluated on an NVIDIA Jetson Orin Nano edge device and desktop host, it delivers 3-5x latency reduction across datasets, up to 5.1x lower total edge-side energy consumption, and lower sustained end-to-end latency under bandwidth-limited streaming.

What carries the argument

The sparse occupancy hierarchy that applies lightweight learned occupancy modeling only at shallow uncertain levels and deterministic coding in the deep hierarchy.

If this is right

Mobile 3D streaming can operate with sustained low end-to-end latency even when bandwidth is limited.
Edge devices consume substantially less energy during point cloud encoding and decoding.
Learned codecs become viable for real-time immersive applications without requiring high-end hardware.
Cross-platform deployment avoids the numerical inconsistencies that previously caused decoding failures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The selective placement of learned components could extend to other hierarchical media formats such as meshes or voxel grids on edge platforms.
Reducing neural inference depth might allow integration with existing mobile 3D pipelines that currently rely on traditional codecs.
Further tests on a wider range of mobile chipsets would clarify whether the energy and latency gains hold under diverse thermal and power constraints.

Load-bearing premise

That the lightweight learned model at shallow levels keeps overall coding efficiency competitive without creating new failure modes when the deterministic deep coding runs on varied hardware.

What would settle it

A side-by-side rate-distortion comparison on the same datasets where LEAN-3D shows worse compression efficiency than prior learned codecs, or a cross-platform test revealing persistent decoding errors or no latency gain on additional mobile devices.

Figures

Figures reproduced from arXiv: 2604.04737 by Qi Zhang, Yuchen Gao.

**Figure 2.** Figure 2: Comparison of four geometry-coding paradigms for point cloud compression. [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of the proposed LEAN-3D PCC framework. [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: 2D toy illustration of BPA: occupied child coordinates at level [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: 2D toy illustration of BCE: given a parent coordinate [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 7.** Figure 7: Architecture comparison between the teacher (RENO) model and [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

**Figure 8.** Figure 8: Packet structure of one encoded frame. The top row shows the fixed [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗

**Figure 9.** Figure 9: Per-frame encoding and decoding time comparison of RENO and [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗

**Figure 10.** Figure 10: Component-wise latency attribution of RENO and LEAN-3D on Jetson Orin Nano under different quantization settings ( [PITH_FULL_IMAGE:figures/full_fig_p010_10.png] view at source ↗

**Figure 11.** Figure 11: Qualitative reconstruction comparison under different quantization settings ( [PITH_FULL_IMAGE:figures/full_fig_p011_11.png] view at source ↗

**Figure 12.** Figure 12: System-level streaming performance under [PITH_FULL_IMAGE:figures/full_fig_p011_12.png] view at source ↗

**Figure 13.** Figure 13: Compressed size breakdown of RENO and LEAN-3D under different [PITH_FULL_IMAGE:figures/full_fig_p012_13.png] view at source ↗

**Figure 16.** Figure 16: Decoding failure of cross-platform deployment of RENO on [PITH_FULL_IMAGE:figures/full_fig_p012_16.png] view at source ↗

**Figure 17.** Figure 17: The heterogeneous results show that LEAN-3D accelerates encoding on the resource-constrained edge while also delivering lower and more stable decoding latency on the host. Together with the failure analysis above, this confirms that both the proposed bit-exact entropy coding and compute-aware split are necessary for a deployable cross-platform prototype setup. H. Energy Consumption Beyond runtime, energy… view at source ↗

**Figure 17.** Figure 17: Edge–host codec latency on KITTIDetection under heterogeneous [PITH_FULL_IMAGE:figures/full_fig_p013_17.png] view at source ↗

read the original abstract

We aim to make learned point cloud compression deployable for low-latency streaming on mobile systems. While learned point cloud compression has shown strong coding efficiency, practical deployment on mobile platforms remains challenging because neural inference and entropy coding still incur substantial runtime overhead. This issue is critical for immersive 3D communication, where dense geometry must be delivered under tight end-to-end (E2E) latency and compute constraints. In this paper, we present LEAN-3D, a compute-aware point cloud codec for low-latency streaming. LEAN-3D designs a lightweight learned occupancy model at the shallow levels of a sparse occupancy hierarchy, where structural uncertainty is highest, and develops a lightweight deterministic coding scheme for the deep hierarchy tailored to the near-unary regime. We implement the complete encoder/decoder pipeline and evaluate it on an NVIDIA Jetson Orin Nano edge device and a desktop host. In addition, LEAN-3D addresses the decoding failures observed in cross-platform deployment of learned codecs. Such failures arise from numerical inconsistencies in lossless entropy decoding across heterogeneous platforms. Experiments show that LEAN-3D achieves 3-5x latency reduction across datasets, reduces total edge-side energy consumption by up to 5.1x, and delivers lower sustained E2E latency under bandwidth-limited streaming. These results bring learned point cloud compression closer to deployable mobile 3D streaming.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LEAN-3D's shallow-learned plus deep-deterministic hierarchy split plus cross-platform fix delivers measurable latency and energy wins on real mobile hardware.

read the letter

The main thing here is the targeted split: learned occupancy only at the shallow hierarchy levels where uncertainty peaks, then deterministic coding for the deep near-unary regime, combined with a fix for numerical inconsistencies that break lossless decoding across platforms. That combination is presented as a practical engineering choice rather than a broad new theory, and the paper follows through with a full encoder/decoder pipeline tested on Jetson Orin Nano edge hardware plus a desktop host. The reported outcomes are 3-5x latency reduction across datasets, up to 5.1x lower edge energy use, and better sustained end-to-end latency under bandwidth limits, with rate-distortion curves and cross-dataset numbers included. These are the kind of concrete deployment metrics that move the work past pure research prototypes. The implementation details and hardware measurements give the claims more weight than abstract-only papers in this space. One soft spot is that the gains are measured on specific datasets and the Jetson setup, so the magnitude could shift with other hardware or content; the paper does not appear to overclaim universality. No load-bearing inconsistencies show up in the experimental design or results. This paper is aimed at people working on deployable 3D streaming for AR/VR and mobile immersive communication. It has enough real implementation and falsifiable numbers to deserve a serious referee rather than a desk reject.

Referee Report

2 major / 3 minor

Summary. The paper introduces LEAN-3D, a compute-aware point cloud codec for low-latency mobile 3D streaming. It combines a lightweight learned occupancy model at shallow levels of a sparse occupancy hierarchy (where structural uncertainty is highest) with a deterministic coding scheme for the deep hierarchy in the near-unary regime. The authors implement the full encoder/decoder pipeline, evaluate it on an NVIDIA Jetson Orin Nano edge device paired with a desktop host, and address cross-platform decoding failures arising from numerical inconsistencies in lossless entropy decoding. Experiments report 3-5x latency reduction across datasets, up to 5.1x reduction in total edge-side energy consumption, and lower sustained end-to-end latency under bandwidth-limited streaming.

Significance. If the reported measurements hold, the work meaningfully advances practical deployment of learned point cloud compression for immersive applications. By demonstrating concrete latency and energy gains on real mobile hardware while maintaining competitive rate-distortion behavior through the hybrid learned-deterministic design, and by providing a concrete fix for cross-platform entropy decoding inconsistencies, the paper supplies an engineering pathway that brings learned codecs closer to real-time 3D streaming constraints.

major comments (2)

[Section 4, Figure 5, Table 2] Section 4 (Experiments), Figure 5 and Table 2: the rate-distortion curves and BD-rate numbers are presented relative to a small set of baselines, but the paper does not report error bars or multiple random seeds for the learned occupancy model; this leaves open whether the claimed preservation of coding efficiency is statistically robust across the reported datasets.
[Section 3.2] Section 3.2 (Deterministic Deep Coding): the claim that the deterministic scheme introduces no new failure modes on heterogeneous platforms is supported only by the cross-platform consistency fix; a direct ablation showing the impact on reconstruction quality when the deterministic path is replaced by a learned alternative would strengthen the central efficiency claim.

minor comments (3)

[Section 2, Algorithm 1] The notation for the sparse occupancy hierarchy levels (shallow vs. deep) is introduced in Section 2 but used inconsistently in the pseudocode of Algorithm 1; a single consistent definition would improve readability.
[Section 4.3] Energy measurements in Section 4.3 are reported as total edge-side consumption; clarifying whether this includes only inference or also I/O and entropy coding overhead would make the 5.1x figure easier to interpret.
[Abstract, Section 1] The abstract and introduction cite '3-5x latency reduction' without specifying the exact baseline codec and platform configuration for each number; adding a short table mapping each multiplier to its reference would aid quick assessment.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the recommendation for minor revision. We address each major comment below, providing clarifications and indicating revisions made to strengthen the manuscript where appropriate.

read point-by-point responses

Referee: Section 4 (Experiments), Figure 5 and Table 2: the rate-distortion curves and BD-rate numbers are presented relative to a small set of baselines, but the paper does not report error bars or multiple random seeds for the learned occupancy model; this leaves open whether the claimed preservation of coding efficiency is statistically robust across the reported datasets.

Authors: We agree that statistical robustness across training runs would better support the rate-distortion claims. The learned occupancy model was trained with a fixed random seed in the original experiments to ensure reproducibility. To address this concern, we have re-trained the model using three different random seeds and recomputed the rate-distortion curves and BD-rate values. The observed variations are small (under 0.4% in BD-rate across datasets), confirming that the hybrid design preserves coding efficiency consistently. We have updated Figure 5 and Table 2 to include error bars in the revised manuscript. revision: yes
Referee: Section 3.2 (Deterministic Deep Coding): the claim that the deterministic scheme introduces no new failure modes on heterogeneous platforms is supported only by the cross-platform consistency fix; a direct ablation showing the impact on reconstruction quality when the deterministic path is replaced by a learned alternative would strengthen the central efficiency claim.

Authors: We appreciate the suggestion for an ablation. However, replacing the deterministic coding with a learned alternative at deep levels would require implementing and training additional neural models for the near-unary regime, which would negate the latency and energy benefits central to LEAN-3D. The deterministic scheme uses only integer arithmetic and is lossless by design, introducing no reconstruction quality degradation. We have expanded the discussion in Section 3.2 to explicitly explain that the scheme avoids floating-point operations, thereby eliminating numerical inconsistencies on heterogeneous platforms beyond what the cross-platform fix already addresses. This clarification reinforces the efficiency rationale without an impractical ablation. revision: partial

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper describes an engineering system (LEAN-3D) that combines a lightweight learned occupancy model at shallow hierarchy levels with a deterministic coding scheme at deeper levels, followed by full pipeline implementation and hardware measurements on Jetson Orin Nano and desktop. No equations, fitted parameters, or first-principles derivations are presented whose outputs reduce by construction to the inputs. Performance claims (latency, energy, E2E latency) rest on direct experimental evaluation across datasets rather than any self-referential prediction or self-citation chain. The design is presented as a practical combination of existing techniques with platform-specific fixes, making the central claims self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The approach builds on prior learned point cloud codecs without introducing new postulated entities.

pith-pipeline@v0.9.0 · 5550 in / 1171 out tokens · 49698 ms · 2026-05-10T19:00:25.303824+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We observe a shift in occupancy statistics across the hierarchy... unary fraction counts nodes with popcount(O)=1... Ds as the first level at which this ratio exceeds 60%
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

8-bit occupancy code O(d)(v) ∈ {0,...,255}... popcount(O(d)(v)) ≈ 1 for large d

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 7 canonical work pages · 1 internal anchor

[1]

Emerging MPEG standards for point cloud compression,

S. Schwarz, M. Preda, V . Baroncini, M. Budagavi, P. Cesar, P. A. Chou, R. A. Cohen, M. Krivoku ´ca, S. Lasserre, Z. Liet al., “Emerging MPEG standards for point cloud compression,”IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 9, no. 1, pp. 133–148, 2019

2019
[2]

An overview of ongoing point cloud compression stan- dardization activities: Video-based (V-PCC) and geometry-based (G- PCC),

D. Graziosi, O. Nakagami, S. Kuma, A. Zaghetto, T. Suzuki, and A. Tabatabai, “An overview of ongoing point cloud compression stan- dardization activities: Video-based (V-PCC) and geometry-based (G- PCC),”APSIPA Transactions on Signal and Information Processing, vol. 9, p. e13, 2020

2020
[3]

What’s new in point cloud compres- sion?

C. Cao, M. Preda, and T. Zaharia, “What’s new in point cloud compres- sion?”Global J. Eng. Sci., vol. 4, no. 5, 2020

2020
[4]

The tactile internet,

ITU-T Technology Watch, “The tactile internet,” Aug. 2014

2014
[5]

The tactile internet: Applications and challenges,

G. P. Fettweis, “The tactile internet: Applications and challenges,”IEEE Vehicular Technology Magazine, 2014

2014
[6]

TR 38.913: Study on scenarios and requirements for next generation access technologies,

3GPP, “TR 38.913: Study on scenarios and requirements for next generation access technologies,” 3rd Generation Partnership Project, Tech. Rep., 2020

2020
[7]

Survey on deep learning-based point cloud compression,

M. Quach, J. Pang, D. Tian, G. Valenzise, and F. Dufaux, “Survey on deep learning-based point cloud compression,”Frontiers in Signal Processing, vol. 2, p. 846972, 2022

2022
[8]

Deep learning- based point cloud compression: An in-depth survey and benchmark,

W. Gao, L. Xie, S. Fan, G. Li, S. Liu, and W. Gao, “Deep learning- based point cloud compression: An in-depth survey and benchmark,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

2025
[9]

Real-time compression of point cloud streams,

J. Kammerl, N. Blodow, R. B. Rusu, S. Gedikli, M. Beetz, and E. Steinbach, “Real-time compression of point cloud streams,” in2012 IEEE International Conference on Robotics and Automation (ICRA), 2012, pp. 778–785

2012
[10]

Reno: Real-time neural compression for 3d lidar point clouds,

K. You, T. Chen, D. Ding, M. S. Asif, and Z. Ma, “Reno: Real-time neural compression for 3d lidar point clouds,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

2025
[11]

RFC 9330: Low latency, low loss, scalable throughput (L4S) internet service: Architecture,

IETF, “RFC 9330: Low latency, low loss, scalable throughput (L4S) internet service: Architecture,” https://www.rfc-editor.org/rfc/rfc9330. html, 2023

2023
[12]

Real-time point cloud data transmission via l4s for 5g-edge- assisted robotics,

G. Damigos, A. S. Seisa, N. Stathoulopoulos, S. Sandberg, and G. Niko- lakopoulos, “Real-time point cloud data transmission via l4s for 5g-edge- assisted robotics,”arXiv preprint arXiv:2511.15677, 2025

work page arXiv 2025
[13]

Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding

J. Duda, “Asymmetric numeral systems: entropy coding combining speed of huffman coding with compression rate of arithmetic coding,” arXiv preprint arXiv:1311.2540, 2013

work page Pith review arXiv 2013
[14]

Interleaved entropy coders,

F. Giesen, “Interleaved entropy coders,” https://fgiesen.wordpress.com/ 2014/02/18/rans-in-practice/, 2014

2014
[15]

Efficient storage and retrieval by content and address of static files,

P. Elias, “Efficient storage and retrieval by content and address of static files,”Journal of the ACM, vol. 21, no. 2, pp. 246–260, 1974

1974
[16]

Quasi-succinct indices,

S. Vigna, “Quasi-succinct indices,” inProceedings of the sixth ACM international conference on Web search and data mining, 2013, pp. 83– 92

2013
[17]

Information technology — coded representation of immersive media — part 9: Geometry-based point cloud compression (G-PCC), ISO/IEC 23090-9:2023,

ISO/IEC, “Information technology — coded representation of immersive media — part 9: Geometry-based point cloud compression (G-PCC), ISO/IEC 23090-9:2023,” 2023, international Standard

2023
[18]

——, “Information technology — coded representation of immersive media — part 5: Visual volumetric video-based coding (V3C) and video- based point cloud compression (V-PCC), ISO/IEC 23090-5:2021,” 2021, international Standard

2021
[19]

Draco: 3d data compression (meshes and point clouds),

Google, “Draco: 3d data compression (meshes and point clouds),” https: //github.com/google/draco, accessed in April, 2026, software repository

2026
[20]

Octree-based point-cloud compression,

R. Schnabel and R. Klein, “Octree-based point-cloud compression,” in Proceedings of the Symposium on Point-Based Graphics, 2006

2006
[21]

Real-time point cloud compression,

T. Golla and R. Klein, “Real-time point cloud compression,” in2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 5087–5092

2015
[22]

Compressing ros sensor and geometry messages with draco,

T. Wiemann, F. Igelbrink, S. P ¨utz, M. K. Piening, S. Schupp, S. Hin- derink, J. Vana, and J. Hertzberg, “Compressing ros sensor and geometry messages with draco,” in2019 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), 2019, pp. 243–248

2019
[23]

Content-aware adaptive point cloud delivery,

Y . Alkhalili, T. Gruczyk, T. Meuser, A. F. Anta, A. Khalil, and A. Mau- the, “Content-aware adaptive point cloud delivery,” in2022 IEEE Eighth International Conference on Multimedia Big Data (BigMM), 2022, pp. 13–20

2022
[24]

A comprehensive study and comparison of core technologies for mpeg 3d point cloud compression,

H. Liu, H. Yuan, Q. Liu, J. Hou, and J. Liu, “A comprehensive study and comparison of core technologies for mpeg 3d point cloud compression,” IEEE Transactions on Broadcasting, vol. 66, no. 3, pp. 701–717, 2020

2020
[25]

Learning- based lossless compression of 3d point cloud geometry,

D. T. Nguyen, M. Quach, G. Valenzise, and P. Duhamel, “Learning- based lossless compression of 3d point cloud geometry,” pp. 4220–4224, 2021

2021
[26]

8i vox- elized full bodies – a voxelized point cloud dataset,

E. d’Eon, B. Harrison, T. Myers, and P. A. Chou, “8i vox- elized full bodies – a voxelized point cloud dataset,” 2017, iSO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document WG11M40059/WG1M74006, Geneva, January 2017

2017
[27]

Octattention: Octree-based large-scale contexts model for point cloud compression,

C. Fu, G. Li, R. Song, W. Gao, and S. Liu, “Octattention: Octree-based large-scale contexts model for point cloud compression,” inProceedings of the AAAI Conference on Artificial Intelligence, 2022

2022
[28]

Semantickitti: A dataset for semantic scene understanding of lidar sequences,

J. Behley, M. Garbade, A. Milioto, J. Quenzel, S. Behnke, C. Stachniss, and J. Gall, “Semantickitti: A dataset for semantic scene understanding of lidar sequences,” inProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 9297–9307

2019
[29]

Efficient hierarchical entropy model for learned point cloud compression,

R. Song, G. Li, C. Fu, W. Gao, and S. Liu, “Efficient hierarchical entropy model for learned point cloud compression,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

2023
[30]

Multiscale point cloud geometry compression,

J. Wang, D. Ding, Z. Li, and Z. Ma, “Multiscale point cloud geometry compression,” in2021 Data Compression Conference (DCC). IEEE, 2021, pp. 73–82

2021
[31]

Are we ready for autonomous driving? the kitti vision benchmark suite,

A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 3354–3361. 15

2012
[32]

Octsqueeze: Octree-structured entropy model for lidar compression,

L. Huang, S. Wang, K. Wong, J. Liu, and R. Urtasun, “Octsqueeze: Octree-structured entropy model for lidar compression,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 1313–1323

2020
[33]

A review of 6-dof volumetric video communication systems: Ai empowered perspectives,

Y . Huang, Y . Gao, Y . Zhu, Z. Chen, R. Sidhu, X. Qiao, Q. Zhang, and X. Su, “A review of 6-dof volumetric video communication systems: Ai empowered perspectives,”IEEE Communications Surveys & Tutorials, 2026, to appear

2026
[34]

Learning convolutional trans- forms for lossy point cloud geometry compression,

M. Quach, G. Valenzise, and F. Dufaux, “Learning convolutional trans- forms for lossy point cloud geometry compression,” in2019 IEEE international conference on image processing (ICIP). IEEE, 2019, pp. 4320–4324

2019
[35]

Learned point cloud geometry compression,

J. Wang, H. Zhu, Z. Ma, T. Chen, H. Liu, and Q. Shen, “Learned point cloud geometry compression,”arXiv preprint arXiv:1909.12037, 2019

work page arXiv 1909
[36]

Multiscale deep context modeling for lossless point cloud geometry compression,

D. T. Nguyen, M. Quach, G. Valenzise, and P. Duhamel, “Multiscale deep context modeling for lossless point cloud geometry compression,” in2021 IEEE International Conference on Multimedia & Expo Work- shops (ICMEW). IEEE, 2021, pp. 1–6

2021
[37]

Ecm-opcc: Efficient context model for octree-based point cloud compression,

Y . Liuet al., “Ecm-opcc: Efficient context model for octree-based point cloud compression,”arXiv preprint arXiv:2211.10916, 2022

work page arXiv 2022
[38]

Sparse tensor- based multiscale representation for point cloud geometry compression,

J. Wang, D. Ding, Z. Li, X. Feng, C. Cao, and Z. Ma, “Sparse tensor- based multiscale representation for point cloud geometry compression,” arXiv preprint arXiv:2111.10633, 2021

work page arXiv 2021
[39]

Hint: Hierarchical inter-frame correlation for one-shot point cloud sequence compression,

Y . Gao and Q. Zhang, “Hint: Hierarchical inter-frame correlation for one-shot point cloud sequence compression,”arXiv preprint arXiv:2509.14859, 2025

work page arXiv 2025
[40]

Practical full resolution learned lossless image compression,

F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, and L. Van Gool, “Practical full resolution learned lossless image compression,” inPro- ceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

2019
[41]

Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

B. Wilson, W. Qi, T. Agarwal, J. Lambert, J. Singh, S. Khandelwal, B. Pan, R. Kumar, A. Hartnett, J. K. Pontes, D. Ramanan, P. Carr, and J. Hays, “Argoverse 2: Next generation datasets for self-driving perception and forecasting,”arXiv preprint arXiv:2301.00493, 2023. Yuchen Gaois a Marie Skłodowska-Curie Ac- tions Doctoral Candidate (MSCA DC) with the De...

work page internal anchor Pith review arXiv 2023