pith. machine review for the scientific record. sign in

arxiv: 2604.12949 · v1 · submitted 2026-04-14 · 💻 cs.HC

Recognition: unknown

GlintMarkers: Spatial Perception on XR Eyewear using Corneal Reflections

Authors on Pith no claims yet

Pith reviewed 2026-05-10 14:26 UTC · model grok-4.3

classification 💻 cs.HC
keywords corneal reflectionXR eyewearspatial perceptionretroreflective markersgaze-driven sensingPnP estimationnear-infrared glintsobject tracking
0
0 comments X

The pith

GlintMarkers uses reflections from passive markers on the cornea to let XR eyewear cameras estimate object positions and orientations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GlintMarkers as the first method for gaze-driven spatial perception that relies solely on inward-facing cameras already present in XR eyewear. It rests on the observation that the cornea reflects light from the environment and can therefore carry information about nearby objects when those objects are fitted with passive retroreflective markers. The markers are designed to concentrate near-infrared light into bright glint patterns that remain visible despite the camera's small pixel count. A custom Perspective-n-Point solver then converts the observed glint geometry into estimates of distance, orientation, and unique identity for each tagged object. If this pipeline succeeds, XR systems gain a new sensing channel for the physical world without requiring outward cameras or powered tags.

Core claim

The cornea functions as a compact mirror that encodes both the user's gaze direction and visual details of the surrounding scene. Passive retroreflective markers placed on objects concentrate reflected near-infrared light into distinct glint patterns on this mirror surface. These patterns are captured by the inward-facing camera and fed into a custom Perspective-n-Point estimation framework adapted for corneal geometry, yielding reliable measurements of object orientation, distance, and unique identification.

What carries the argument

The passive retroreflective marker design that produces concentrated bright glints on the cornea, paired with a custom Perspective-n-Point (PnP) estimation framework adapted to the geometry of corneal reflections.

If this is right

  • XR eyewear can determine the three-dimensional position and orientation of tagged objects using only its existing inward-facing cameras.
  • Multiple objects can be uniquely identified and tracked simultaneously through their distinct glint signatures.
  • Spatial perception becomes possible without outward-facing cameras or active electronic tags on the environment.
  • Gaze direction and environmental layout are recovered from the same corneal image stream.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be tested with untagged everyday objects by analyzing natural specular highlights instead of engineered markers.
  • Integration with existing eye-tracking pipelines in commercial XR headsets would require only software changes rather than new hardware.
  • The same glint data might support additional inferences such as surface material or rough shape once more sophisticated decoding is added.
  • Privacy implications arise because the system records reflections of the user's immediate surroundings on the eye itself.

Load-bearing premise

The cornea must act as a sufficiently faithful mirror whose small, low-contrast reflections can be made bright enough by passive markers to support accurate spatial calculations with the limited resolution of an inward-facing camera.

What would settle it

A controlled experiment that measures the root-mean-square error of distance and orientation estimates against ground-truth motion-capture data while varying lighting conditions and marker distances would show whether the glint patterns contain enough geometric information for reliable PnP results.

Figures

Figures reproduced from arXiv: 2604.12949 by Chris Harrison, Justin Chan, Mayank Goel, Seungjoo Lee, Vimal Mollyn.

Figure 1
Figure 1. Figure 1: GlintMarkers enables spatial perception of the physical world using only an inward-facing eye camera. By capturing corneal reflections of retroreflective markers placed on everyday objects, GlintMarkers identify the target object and estimate its 3D orientation and distance relative to the user. Abstract We present GlintMarkers, the first system to perform gaze-driven spatial perception using the inward-fa… view at source ↗
Figure 2
Figure 2. Figure 2: Example applications of GlintMarkers [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Experimental setup for characterizing retroreflective marker design parameters. (A) Retroreflective patches of three [PITH_FULL_IMAGE:figures/full_fig_p003_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Left: Detection accuracy of different sized retrore [PITH_FULL_IMAGE:figures/full_fig_p003_4.png] view at source ↗
Figure 7
Figure 7. Figure 7: GlintMarkers layout. A N×N grid of retroreflective patches encodes object ID and provides fiducial anchors for PnP pose estimation (3-DoF orientation and distance). Exam￾ple 3 × 3 and 4 × 4 markers are shown. detected if all individual patches are visible in the corneal reflections to a human annotator. Frames affected by blinks were excluded from analysis [PITH_FULL_IMAGE:figures/full_fig_p004_7.png] view at source ↗
Figure 6
Figure 6. Figure 6: Comparison of a conventional ArUco marker and a [PITH_FULL_IMAGE:figures/full_fig_p004_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: High-level overview of GlintMarkers’s processing pipeline. While three non-collinear points is the theoretical minimum needed to solve PnP, using four corners provides a more robust overde￾termined system. To resolve rotational ambiguity, the top-right corner patch is larger than the others, establishing a canonical orientation. We use a 20 mm orientation patch for the short- to mid￾range marker and a 30 m… view at source ↗
Figure 9
Figure 9. Figure 9: Corneal reflections for markers at different orientations. Rotational shifts in the marker are reflected in the glint [PITH_FULL_IMAGE:figures/full_fig_p006_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Orientation estimation MAE for the 24 cm tag span [PITH_FULL_IMAGE:figures/full_fig_p007_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Calibrated distance estimation accuracy for the [PITH_FULL_IMAGE:figures/full_fig_p008_11.png] view at source ↗
Figure 14
Figure 14. Figure 14: GlintMarkers attached to curved objects. Top: ex￾amples on a chair and an electric kettle. Bottom: correspond￾ing corneal reflections showing that the markers remain visible on moderately curved surfaces. 5 Discussion Curved surfaces. We tested whether GlintMarkers remain visible when attached to curved objects. Specifically, we placed markers on a curved chair back and on the side of an electric kettle. … view at source ↗
Figure 13
Figure 13. Figure 13: Object identification accuracy. Per-frame (top) and [PITH_FULL_IMAGE:figures/full_fig_p009_13.png] view at source ↗
Figure 15
Figure 15. Figure 15: Corneal reflection under iPhone LiDAR dot pro [PITH_FULL_IMAGE:figures/full_fig_p010_15.png] view at source ↗
read the original abstract

We present GlintMarkers, the first system to perform gaze-driven spatial perception using the inward-facing cameras on XR eyewear. Our key observation is that the cornea acts as a mirror that encodes both gaze direction and visual information about the environment in a small, low-contrast reflection. To extract spatial and semantic information from this reflection despite the camera's limited pixel budget, we present a passive retroreflective marker design that concentrates reflected near-infrared light onto the cornea, producing bright glint patterns. We develop a custom Perspective-n-Point (PnP) estimation framework adapted to corneal imaging and perform orientation and distance estimation of tagged objects, as well as unique object identification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents GlintMarkers, the first system for gaze-driven spatial perception on XR eyewear that repurposes inward-facing cameras to capture corneal reflections of passive retroreflective markers. It introduces a marker design to concentrate NIR light into bright glint patterns and a custom PnP framework to recover 6-DoF pose (orientation and distance) of tagged objects along with unique identification, despite the limited pixel budget of the corneal reflection.

Significance. If the core technical assumptions hold, the work would offer a hardware-minimal approach to environmental sensing in XR by exploiting existing eye-tracking cameras and corneal optics, potentially enabling new gaze-based spatial interactions. The novelty of the retroreflective marker design and corneal-adapted PnP is notable, but the lack of any reported error metrics or validation data substantially reduces the assessed significance at present.

major comments (3)
  1. [Abstract and §4 (PnP Framework)] The abstract and introduction claim reliable orientation and distance estimation via the custom PnP framework, yet no quantitative results (e.g., mean reprojection error, angular accuracy, or distance error) are provided anywhere in the manuscript to demonstrate that the solver converges on the low-resolution, low-contrast glints.
  2. [§3 (Marker Design)] §3 (Marker Design): The feasibility claim that retroreflective markers produce glint patterns bright enough and geometrically stable enough for PnP rests on an untested assumption about signal-to-noise ratio; the text acknowledges the limited pixel budget but supplies no pixel-occupancy measurements, SNR analysis, or ambient-light robustness tests.
  3. [§4 (PnP Framework) and §5 (Evaluation)] The central 6-DoF estimation claim depends on the cornea acting as a sufficiently spherical mirror whose radius and asphericity variations across users do not push detected correspondences outside the PnP basin of convergence; no user-variability study or calibration-drift analysis is reported.
minor comments (2)
  1. [§4] Notation for the custom PnP objective function and the retroreflective marker geometry should be defined more explicitly with equations rather than prose descriptions.
  2. [Figures 3-5] Figure captions for the glint-pattern examples would benefit from scale bars or pixel-count annotations to illustrate the limited resolution.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments correctly identify gaps in quantitative validation and analysis that we will address through targeted revisions and additional experiments. Below we respond point by point to the major comments.

read point-by-point responses
  1. Referee: [Abstract and §4 (PnP Framework)] The abstract and introduction claim reliable orientation and distance estimation via the custom PnP framework, yet no quantitative results (e.g., mean reprojection error, angular accuracy, or distance error) are provided anywhere in the manuscript to demonstrate that the solver converges on the low-resolution, low-contrast glints.

    Authors: We agree that the manuscript would be strengthened by explicit quantitative error metrics. While §5 presents functional demonstrations of 6-DoF estimation and identification, it does not report numerical values such as mean reprojection error or angular/distance accuracy. In the revision we will add these metrics, computed from our existing experimental datasets, to §4 and §5 to directly substantiate convergence and reliability on the low-resolution glints. revision: yes

  2. Referee: [§3 (Marker Design)] §3 (Marker Design): The feasibility claim that retroreflective markers produce glint patterns bright enough and geometrically stable enough for PnP rests on an untested assumption about signal-to-noise ratio; the text acknowledges the limited pixel budget but supplies no pixel-occupancy measurements, SNR analysis, or ambient-light robustness tests.

    Authors: The referee is correct that §3 relies on optical principles without supporting measurements. We will revise §3 to include pixel-occupancy statistics for the glint patterns, quantitative SNR values under controlled and ambient lighting, and robustness test results. These additions will be drawn from supplementary optical characterization experiments we will perform and report. revision: yes

  3. Referee: [§4 (PnP Framework) and §5 (Evaluation)] The central 6-DoF estimation claim depends on the cornea acting as a sufficiently spherical mirror whose radius and asphericity variations across users do not push detected correspondences outside the PnP basin of convergence; no user-variability study or calibration-drift analysis is reported.

    Authors: We acknowledge the absence of a dedicated user-variability study. Our current evaluation uses a fixed corneal model and limited participant data. In the revision we will expand §5 with a multi-user study that measures the effects of corneal radius and asphericity variation on correspondence accuracy and PnP convergence, together with an analysis of calibration drift over time. This will be accompanied by updated discussion of model assumptions. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external CV primitives and empirical validation

full rationale

The paper's core pipeline—retroreflective marker design to produce corneal glints, followed by an adapted PnP solver for 6-DoF pose—applies standard computer-vision techniques (Perspective-n-Point) to a new imaging modality. No equation or claim reduces by construction to a fitted parameter defined by the result itself, nor does any load-bearing step rest on a self-citation chain whose prior work is unverified. The abstract and described framework treat the cornea's reflective properties and marker concentration as physical observations to be validated experimentally, not as tautological inputs. The derivation is therefore self-contained against external benchmarks such as reprojection error on real corneal images and marker visibility measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities beyond standard assumptions in computer vision and optics. The marker design and PnP adaptation are presented as novel contributions rather than fitted or postulated entities.

pith-pipeline@v0.9.0 · 5415 in / 1076 out tokens · 23341 ms · 2026-05-10T14:26:26.342441+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

30 extracted references · 9 canonical work pages

  1. [1]

    AliExpress. n.d.. 8MP 4K IMX678 USB Camera Module. https://www.aliexpress. us/item/3256811395981728.html. Accessed: 2026-04-01

  2. [2]

    Hadi Alzayer, Kevin Zhang, Brandon Feng, Christopher A Metzler, and Jia-Bin Huang. 2024. Seeing the world through your eyes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4864–4873

  3. [3]

    Apple. 2026. Apple Vision Pro. https://www.apple.com/apple-vision-pro/

  4. [4]

    Ananta Narayanan Balaji, Clayton Kimber, David Li, Shengzhi Wu, Ruofei Du, and David Kim. 2023. RetroSphere: Self-Contained Passive 3D Controller Tracking for Augmented Reality.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.6, 4, Article 157 (Jan. 2023), 36 pages. doi:10.1145/3569479

  5. [5]

    Toby Collins and Adrien Bartoli. 2014. Infinitesimal plane-based pose estimation. International journal of computer vision109, 3 (2014), 252–286

  6. [6]

    Mustafa Doga Dogan, Raul Garcia-Martin, Patrick William Haertel, Jamison John O’Keefe, Ahmad Taka, Akarsh Aurora, Raul Sanchez-Reillo, and Stefanie Mueller

  7. [7]

    In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology(San Francisco, CA, USA)(UIST ’23)

    BrightMarker: 3D Printed Fluorescent Markers for Object Tracking. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology(San Francisco, CA, USA)(UIST ’23). Association for Computing Machinery, New York, NY, USA, Article 55, 13 pages. doi:10.1145/3586183.3606758

  8. [8]

    Mustafa Doga Dogan, Ahmad Taka, Michael Lu, Yunyi Zhu, Akshat Kumar, Aakar Gupta, and Stefanie Mueller. 2022. InfraredTags: Embedding Invisible AR Markers and Barcodes Using Low-Cost, Infrared-Based 3D Printing and Imaging Tools. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA)(CHI ’22). Association fo...

  9. [9]

    Anjith George and Aurobinda Routray. 2016. Fast and accurate algorithm for eye localisation for gaze tracking in low-resolution images.IET Computer Vision10, 7 (2016), 660–669

  10. [10]

    Michail Kalaitzakis, Brennan Cain, Sabrina Carroll, Anand Ambrosi, Camden Whitehead, and Nikolaos Vitzilaios. 2021. Fiducial markers for pose estimation: Overview, applications and experimental comparison of the artag, apriltag, aruco and stag markers.Journal of Intelligent & Robotic Systems101, 4 (2021), 71

  11. [11]

    Tharindu Kaluarachchi, Shamane Siriwardhana, Elliott Wen, and Suranga Nanayakkara. 2023. A Corneal Surface Reflections-Based Intelligent System for Lifelogging Applications.International Journal of Human–Computer Interac- tion39, 9 (2023), 1963–1980. arXiv:https://doi.org/10.1080/10447318.2022.2163240 doi:10.1080/10447318.2022.2163240

  12. [12]

    Kato and M

    H. Kato and M. Billinghurst. 1999. Marker tracking and HMD calibration for a video-based augmented reality conferencing system. InProceedings 2nd IEEE and ACM International Workshop on Augmented Reality (IW AR’99). 85–94. doi:10. 1109/IWAR.1999.803809

  13. [13]

    The story of life is quicker than the blink of an eye

    Christian Lander, Antonio Krüger, and Markus Löchtefeld. 2016. "The story of life is quicker than the blink of an eye": using corneal imaging for life log- ging. InProceedings of the 2016 ACM International Joint Conference on Per- vasive and Ubiquitous Computing: Adjunct(Heidelberg, Germany)(UbiComp ’16). Association for Computing Machinery, New York, NY,...

  14. [14]

    LEDGUHON. n.d.. 10Pcs 850nm High Power SMD LED Chip 3W Led Emitting Diode Infrared Emitting Quadruple Chip Use for Infrared Transmitter (120deg, IR 850nm–856nm, 700mA). https://www.amazon.com/dp/B0CJ5C8TCF. Amazon product page. Accessed: 2026-04-01

  15. [15]

    Meta. 2026. Meta Orion. https://www.meta.com/emerging-tech/orion

  16. [16]

    MidOpt. n.d.. BP850 Near-IR Bandpass Filter for Machine Vision. https://midopt. com/filters/bp850/. Accessed: 2026-04-01

  17. [17]

    Atsushi Nakazawa, Christian Nitschke, and Toyoaki Nishida. 2016. Registration of eye reflection and scene images using an aspherical eye model.J. Opt. Soc. Am. A33, 11 (Nov 2016), 2264–2276. doi:10.1364/JOSAA.33.002264

  18. [18]

    Ko Nishino and Shree K Nayar. 2004. Eyes for relighting.ACM Transactions on Graphics (TOG)23, 3 (2004), 704–711

  19. [19]

    Christian Nitschke and Atsushi Nakazawa. 2012. Super-Resolution from Corneal Images. InBMVC. 1–12

  20. [20]

    Christian Nitschke, Atsushi Nakazawa, and Haruo Takemura. 2013. Corneal imaging revisited: An overview of corneal reflection analysis and applications. IPSJ Transactions on Computer Vision and Applications5 (2013), 1–18

  21. [21]

    Yuki Ohshima, Kyosuke Maeda, Yusuke Edamoto, and Atsushi Nakazawa. 2021. Visual place recognition from eye reflection.IEEE Access9 (2021), 57364–57371

  22. [22]

    Edwin Olson. 2011. AprilTag: A robust and flexible visual fiducial system. In2011 IEEE international conference on robotics and automation. IEEE, 3400–3407

  23. [23]

    OpenCV. 2026. Blob Detection using OpenCV. https://opencv.org/blob-detection- using-opencv/

  24. [24]

    Erich Schubert, Jörg Sander, Martin Ester, Hans Peter Kriegel, and Xiaowei Xu

  25. [25]

    ACM transactions on database systems (tods)42, 3 (2017), 1–21

    DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM transactions on database systems (tods)42, 3 (2017), 1–21

  26. [26]

    Nusrat Sharmin and Remus Brad. 2012. Optimal filter estimation for Lucas- Kanade optical flow.Sensors12, 9 (2012), 12694–12709

  27. [27]

    Huiqiong Wang, Stephen Lin, Xiuqing Ye, and Weikang Gu. 2008. Separating corneal reflections for illumination estimation.Neurocomputing71, 10-12 (2008), 1788–1797

  28. [28]

    John Wang and Edwin Olson. 2016. AprilTag 2: Efficient and robust fiducial detection. In2016 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, 4193–4198

  29. [29]

    Sunu Wibirama, Igi Ardiyanto, Thoriq Satriya, Teguh Bharata Adji, Noor Akhmad Setiawan, and Muhammad Taufiq Setiawan. 2019. An improved pupil localization technique for real-time video-oculography under extreme eyelid occlusion.In- ternational Journal of Innovative Computing, Information and Control15, 4 (2019), 1547–1563

  30. [30]

    Flana- gin, Peter zu Eulenburg, and Seyed-Ahmad Ahmadi

    Yuk-Hoi Yiu, Moustafa Aboulatta, Theresa Raiser, Leoni Ophey, Virginia L. Flana- gin, Peter zu Eulenburg, and Seyed-Ahmad Ahmadi. 2019. DeepVOG: Open- source pupil segmentation and gaze estimation in neuroscience using deep learn- ing.Journal of Neuroscience Methods324 (2019), 108307. doi:10.1016/j.jneumeth. 2019.05.016