Contactless 3D Human Body Measurement Using Depth Cameras for Smart Health Monitoring

Jinghao Yang; Juan Lopez Alvarenga; Lois Akosua Serwaa; Martha Asare; Xuan Wang

arxiv: 2606.11578 · v1 · pith:NBXD36OYnew · submitted 2026-06-10 · 💻 cs.CV

Pith reviewed 2026-06-27 10:42 UTC · model grok-4.3

keywords contactless measurement · depth camera · 3D point cloud · body measurements · volume estimation · smart health monitoring · remote assessment

The pith

A depth camera framework extracts human body measurements like height and volume from a single 3D point cloud capture without physical contact.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method that captures RGB images, depth maps, and point clouds with one depth camera to compute body dimensions remotely. It segments the body from the background, selects landmarks on the point cloud, projects distances onto RGB images, and estimates volume through voxel occupancy plus surface area through mesh reconstruction. The approach targets health monitoring scenarios where contact or on-site staff would limit reach. A sympathetic reader would care because it removes the need for physical interaction while still producing geometric estimates from one snapshot.

Core claim

Processing a single depth capture's point cloud through spatial filtering and landmark selection produces linear measurements such as height and arm span, while voxel-based occupancy and mesh reconstruction yield approximate body volume and visible surface area, all obtained without touching the subject.
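As a rough illustration of the voxel-occupancy idea in the core claim, the sketch below counts occupied voxels and multiplies by the voxel volume. It is not the authors' code: the function name `voxel_volume`, the voxel size, and the synthetic cube are assumptions made for this example. A single-view point cloud covers only the visible surface, so without some interior-filling step (which the paper does not describe) this count measures a thin shell rather than a solid body.

```python
import math

def voxel_volume(points, voxel_size):
    """Approximate volume as (number of occupied voxels) * voxel_size**3.

    points: iterable of (x, y, z) tuples in metres.
    voxel_size: edge length of one cubic voxel in metres (hypothetical value).
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    # Map every point to the integer index of the voxel that contains it;
    # the set keeps each occupied voxel once.
    occupied = {
        (math.floor(x / voxel_size),
         math.floor(y / voxel_size),
         math.floor(z / voxel_size))
        for x, y, z in points
    }
    return len(occupied) * voxel_size ** 3

# Synthetic check: a solid 1 m cube sampled at the centres of 10 cm cells.
step = 0.1
cube = [((i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step)
        for i in range(10) for j in range(10) for k in range(10)]
volume_m3 = voxel_volume(cube, voxel_size=step)
```

The estimate depends strongly on the voxel size: voxels that are too large overestimate the volume, while voxels smaller than the point spacing leave gaps.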

What carries the argument

The point cloud processing pipeline that segments the body, selects landmarks on the 3D data, projects measurements using camera intrinsic parameters, and computes volume and area from voxels and meshes.
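The projection step can be sketched with the standard pinhole camera model, where a 3D point (X, Y, Z) in the camera frame maps to pixel (u, v) = (fx·X/Z + cx, fy·Y/Z + cy). The intrinsic values below are placeholders, not the Orbbec Astra 2 calibration, and lens distortion is ignored; the paper does not state which model it used.

```python
def project_point(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point onto the image plane (pinhole model).

    point: (X, Y, Z) in metres, Z pointing away from the camera.
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v

# Hypothetical intrinsics for a 640x480 image (illustrative only).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# Two hypothetical landmarks 2 m from the camera, image y axis pointing down.
head_px = project_point((0.0, -0.8, 2.0), FX, FY, CX, CY)
feet_px = project_point((0.0, 0.9, 2.0), FX, FY, CX, CY)
```

With a calibrated camera, OpenCV's `cv2.projectPoints` performs the same mapping and also applies distortion coefficients.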

If this is right

Body measurements become obtainable in remote or home settings without trained personnel present.
Volume and surface area estimates can be added to standard linear checks from one capture.
The single-capture method supplies a base for building real-time depth-sensing health systems.
Integration with generative AI models for personalized monitoring becomes feasible.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending the pipeline to video streams could support ongoing rather than snapshot monitoring.
Accuracy would need explicit error metrics against ground truth before clinical deployment.
The same segmentation steps might apply to other depth sensors beyond the one tested.

Load-bearing premise

The distances and volumes calculated after filtering and landmark selection on the point cloud match the subject's actual physical dimensions.

What would settle it

Direct comparison of the camera-derived height and arm span values against manual tape measurements taken on the same participants.
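Such a comparison is usually summarized with mean absolute error (MAE), root-mean-square error (RMSE), and mean signed error (bias). The sketch below shows the arithmetic; the numbers are placeholders invented for illustration and are not data from the paper, which reports none.

```python
import math

def error_metrics(estimated, reference):
    """Return MAE, RMSE, and bias between paired measurements."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - r for e, r in zip(estimated, reference)]
    mae = sum(abs(d) for d in errors) / len(errors)
    rmse = math.sqrt(sum(d * d for d in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return {"mae": mae, "rmse": rmse, "bias": bias}

# Placeholder values in centimetres (NOT results from the paper).
camera_height_cm = [170.0, 180.0, 165.0]
tape_height_cm = [172.0, 178.0, 165.0]
metrics = error_metrics(camera_height_cm, tape_height_cm)
```

Reporting bias alongside MAE separates a systematic offset, which calibration might correct, from random scatter.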

Figures

Figures reproduced from arXiv: 2606.11578 by Jinghao Yang, Juan Lopez Alvarenga, Lois Akosua Serwaa, Martha Asare, Xuan Wang.

**Figure 1.** Proposed pipeline for contactless human body measurement using depth-camera data. RGB and depth images were captured using an Orbbec Astra 2 sensor. The depth image was converted into a 3D point cloud, followed by person segmentation to isolate the human subject. Anthropometric measurements, including height, arm span, body volume, and visible surface area, were estimated from the segmented point cloud.

**Figure 2.** Point cloud processing and person segmentation. The raw 3D point cloud captured from the depth camera contains both human subjects and the surrounding environment. Segmentation techniques are applied to isolate the human body from background objects for further anthropometric analysis.

**Figure 3.** Estimation of anthropometric measurements from a segmented point cloud. Key anatomical landmarks, including the top of the head, bottom of the feet, and both hands, were identified to calculate body measurements. The estimated height and arm span were projected onto an RGB image using the camera intrinsic parameters for visualization and verification.
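The landmark step described in the Figure 3 caption could be approximated as below. This is a sketch under strong assumptions that the paper does not confirm: the segmented cloud is axis-aligned with y pointing up and x running left to right, and the subject stands upright with arms extended sideways. The function name and the toy points are hypothetical.

```python
import math

def height_and_arm_span(points):
    """Estimate height and arm span from extremal points of a body cloud.

    points: list of (x, y, z) tuples in metres; assumes y is up and the
    subject faces the camera with arms held out horizontally.
    """
    head = max(points, key=lambda p: p[1])        # highest point
    feet = min(points, key=lambda p: p[1])        # lowest point
    left_hand = min(points, key=lambda p: p[0])   # leftmost point
    right_hand = max(points, key=lambda p: p[0])  # rightmost point
    height = head[1] - feet[1]
    arm_span = math.dist(left_hand, right_hand)   # straight-line distance
    return height, arm_span

# Toy cloud: head, feet, two fingertips, and one torso point.
toy_body = [
    (0.0, 1.70, 2.0),
    (0.0, 0.00, 2.0),
    (-0.85, 1.40, 2.0),
    (0.85, 1.40, 2.0),
    (0.0, 1.00, 2.0),
]
height_m, arm_span_m = height_and_arm_span(toy_body)
```

Extremal points are sensitive to stray noise, so in practice outliers would need to be removed before taking minima and maxima.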

Original abstract

Contactless body measurement technologies are becoming increasingly significant for smart health monitoring, digital health applications, and remote patient assessment. Traditional anthropometric measurements typically necessitate physical contact and trained personnel, which may constrain scalability in remote healthcare settings. In this study, we introduce a depth camera-based framework for estimating human body measurements utilizing 3D point cloud data. An Orbbec Astra 2 depth camera was employed to capture RGB images, depth maps, and 3D point clouds of participants. The captured point cloud was processed using Python-based tools, including Open3D, NumPy, and OpenCV, to segment the human body from the background. Key anthropometric measurements, such as height and arm span, were computed. The measurements were obtained through a combination of spatial filtering and landmark selection on the 3D point cloud, followed by the projection of the computed measurements onto the corresponding RGB image using camera intrinsic parameters. In addition to linear measurements, the approximate body volume and visible surface area were estimated using voxel-based occupancy analysis and mesh-based surface reconstruction methods. The experimental results from a single depth capture demonstrated that accurate body measurements and geometric estimates could be obtained from depth camera data without physical contact. This study provides a foundation for future real-time systems that integrate depth sensing with intelligent health monitoring and generative AI models for smart healthcare applications.
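The abstract does not say what "spatial filtering" means in practice. One common reading is a pass-through filter that keeps only points inside an axis-aligned box around the subject, sketched below. The bounds and the toy scene are assumptions chosen for illustration, not the authors' thresholds.

```python
def pass_through(points, x_range, y_range, z_range):
    """Keep points whose coordinates fall inside the given (low, high) bounds."""
    def inside(value, bounds):
        low, high = bounds
        return low <= value <= high

    return [
        p for p in points
        if inside(p[0], x_range) and inside(p[1], y_range) and inside(p[2], z_range)
    ]

# Toy scene in metres: two subject points, a back wall, and side furniture.
scene = [
    (0.0, 1.0, 2.0),   # subject
    (0.2, 0.5, 2.1),   # subject
    (0.1, 1.2, 4.5),   # back wall, too far away
    (1.8, 0.3, 2.0),   # furniture, too far to the side
]
body = pass_through(scene, x_range=(-1.0, 1.0), y_range=(0.0, 2.0), z_range=(1.5, 2.5))
```

Open3D offers bounding-box cropping of point clouds for the same purpose. A fixed box only works when the subject stands in a known region in front of the camera.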

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

A private letter to a colleague.

This is a basic implementation description of depth-camera body measurement using off-the-shelf tools, with no validation data or error metrics to support the accuracy claims.

The main takeaway is that the paper outlines a pipeline for contactless anthropometry with an Orbbec Astra 2 camera and standard libraries, but supplies no numbers showing how well the outputs match physical reality.

The work processes RGB, depth, and point cloud data to segment the body, select landmarks for height and arm span, project measurements onto the RGB image via intrinsics, and estimate volume through voxels plus surface area through mesh reconstruction. All steps rely on Open3D, NumPy, and OpenCV. Nothing in the methods is new; these are routine applications of existing point-cloud tools to a known use case. The paper does lay out the full flow clearly and notes the potential for real-time health monitoring systems, which is a reasonable framing for an application note.
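For the mesh-based surface area mentioned above, the underlying arithmetic is a sum of triangle areas, each equal to half the magnitude of the cross product of two edge vectors. The sketch below assumes a triangle mesh has already been reconstructed (the reconstruction method is not named in the source); the function names and the toy square are hypothetical.

```python
import math

def triangle_area(a, b, c):
    """Area of the triangle with 3D vertices a, b, c."""
    ab = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    ac = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(component * component for component in cross))

def mesh_surface_area(vertices, triangles):
    """Sum triangle areas; triangles are index triples into vertices."""
    return sum(
        triangle_area(vertices[i], vertices[j], vertices[k])
        for i, j, k in triangles
    )

# Toy mesh: a 1 m x 1 m square split into two triangles.
square_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_triangles = [(0, 1, 2), (0, 2, 3)]
area_m2 = mesh_surface_area(square_vertices, square_triangles)
```

Open3D's `TriangleMesh.get_surface_area()` returns the same quantity for a reconstructed mesh. From one viewpoint this covers only the visible side of the body, which matches the paper's "visible surface area" wording.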

The central problem is the missing validation. The abstract asserts that accurate measurements were obtained from a single capture, yet the description gives no participant count, no manual reference measurements, no MAE or RMSE values, and no specifics on landmark selection or filtering thresholds. Without those, the accuracy claim cannot be checked. The concern raised by the load-bearing premise, that the computed distances and volumes may not match the subject's physical dimensions, therefore stands on what is shown.

This paper would mainly interest engineers or students prototyping remote monitoring setups who want a concrete example of wiring the libraries together. Researchers looking for new algorithms, theoretical results, or clinically tested methods will not find them here.

I would not bring it to a reading group. I would not cite it. It does not merit peer review until the authors add a results section with quantitative comparisons to ground truth.

Referee Report

2 major / 0 minor

Summary. The paper presents a framework for contactless anthropometric measurement using an Orbbec Astra 2 depth camera. RGB images, depth maps, and 3D point clouds are captured; the point cloud is segmented from the background with Open3D, NumPy, and OpenCV; linear measurements (height, arm span) are obtained via spatial filtering and landmark selection on the point cloud followed by projection onto the RGB image using camera intrinsics; volume and surface area are estimated with voxel occupancy and mesh reconstruction. The abstract asserts that these steps yield accurate measurements from a single capture.

Significance. A validated, fully contactless pipeline for body measurements would be useful for scalable remote health monitoring. The manuscript, however, supplies no quantitative validation, so the significance cannot yet be assessed.

major comments (2)

[Abstract] The claim that 'accurate body measurements and geometric estimates could be obtained' from a single depth capture is unsupported; no participant count, no ground-truth reference measurements, and no error statistics (MAE, RMSE, or similar) are reported anywhere in the manuscript.
[Methods / Experimental Results] The spatial filtering, landmark selection on the point cloud, and subsequent RGB projection steps are described only at a high level; no thresholds, selection criteria, or explicit algorithm for landmark identification are given, so it is impossible to determine whether the computed distances match physical dimensions.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments. We acknowledge that the abstract overclaims accuracy without supporting data and that the methods are described at a high level. We respond point-by-point below.

Referee: [Abstract] The claim that 'accurate body measurements and geometric estimates could be obtained' from a single depth capture is unsupported; no participant count, no ground-truth reference measurements, and no error statistics (MAE, RMSE, or similar) are reported anywhere in the manuscript.

Authors: We agree the claim of 'accurate' measurements is unsupported. The manuscript describes a framework and a single-capture demonstration using standard libraries but contains no participant cohort, ground-truth comparisons, or error metrics. We will revise the abstract to remove the word 'accurate' and state only that the framework enables estimation of linear dimensions, volume, and area from depth data. We cannot add quantitative validation results because no such experiments were conducted. revision: yes

Referee: [Methods / Experimental Results] The spatial filtering, landmark selection on the point cloud, and subsequent RGB projection steps are described only at a high level; no thresholds, selection criteria, or explicit algorithm for landmark identification are given, so it is impossible to determine whether the computed distances match physical dimensions.

Authors: The current text gives only a conceptual description. We will expand the Methods section with concrete implementation details, including the distance thresholds applied for background removal, the criteria used to select landmarks (e.g., extremal points along principal axes of the segmented point cloud), and the exact projection equations that map 3D points to the RGB image using the camera intrinsics. Pseudocode or parameter values from the original code will be added where available. revision: yes
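The rebuttal's example of "extremal points along principal axes" could look like the sketch below, which finds the dominant axis of the point cloud by power iteration on its covariance matrix and then takes the two points with the smallest and largest coordinate along that axis. This is one plausible reading, not the authors' algorithm; the function names, iteration count, and toy data are assumptions. Power iteration can stall if the starting vector happens to be orthogonal to the dominant axis.

```python
import math

def principal_axis(points, iterations=50):
    """Return (centroid, unit vector of the dominant covariance direction)."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    centered = [tuple(p[i] - centroid[i] for i in range(3)) for p in points]
    # 3x3 covariance matrix of the centred points.
    cov = [[sum(q[r] * q[c] for q in centered) / n for c in range(3)]
           for r in range(3)]
    axis = (1 / math.sqrt(3),) * 3  # arbitrary unit starting vector
    for _ in range(iterations):
        w = [sum(cov[r][c] * axis[c] for c in range(3)) for r in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate cloud or unlucky start; keep current axis
            break
        axis = tuple(x / norm for x in w)
    return centroid, axis

def extremal_points(points):
    """Return the two extreme points along the principal axis and their extent."""
    centroid, axis = principal_axis(points)

    def coord(p):
        return sum((p[i] - centroid[i]) * axis[i] for i in range(3))

    low = min(points, key=coord)
    high = max(points, key=coord)
    return low, high, coord(high) - coord(low)

# Toy data: points on a tilted line 1.7 m long, 2 m from the camera.
line = [(0.6 * t, 0.8 * t, 2.0) for t in (0.0, 0.5, 1.0, 1.7)]
low_pt, high_pt, extent_m = extremal_points(line)
```

Unlike taking minima and maxima along fixed camera axes, this approach tolerates a subject who is tilted relative to the camera.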

standing simulated objections not resolved

No quantitative validation data (participant count, ground-truth measurements, or error statistics) exist in the original work; these cannot be supplied without performing new experiments.

Circularity Check

0 steps flagged

No circularity: implementation description with no derivations or fitted predictions

The paper presents a straightforward pipeline for processing depth camera point clouds using off-the-shelf libraries (Open3D, NumPy, OpenCV) to compute linear measurements, volume, and surface area. No equations, fitted parameters, predictions, or self-citations appear in the derivation chain. The accuracy claim lacks ground-truth validation, but this is an evidentiary gap rather than a circular reduction where a result is defined by or forced from its own inputs. The work is self-contained as a methods description without any load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work depends on standard assumptions: that the depth camera is calibrated, that spatial filtering segments the background accurately, and that the selected landmarks correspond to anatomical points. No free parameters, new axioms, or invented entities are introduced.

axioms (2)

domain assumption: Depth camera provides accurate 3D coordinates after intrinsic calibration. Invoked when projecting measurements onto the RGB image using camera intrinsics.
domain assumption: Spatial filtering and landmark selection isolate true body geometry from background and noise. Central to segmenting the human body from the point cloud.

Reference graph

Works this paper leans on

2 extracted references · 1 canonical work pages · 1 internal anchor

[1] Q.-Y. Zhou, J. Park, and V. Koltun, "Open3D: A modern library for 3D data processing," arXiv preprint arXiv:1801.09847, 2018.

[2] OpenCV, "Camera Calibration and 3D Reconstruction (OpenCV Documentation)," OpenCV.org. https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html (accessed 30 January 2025).
