pith. machine review for the scientific record. sign in

arxiv: 2605.14444 · v1 · submitted 2026-05-14 · 📊 stat.ME

Recognition: no theorem link

Inlier Recovery for Robust Registration via Gram-Matrix Overlap

Authors on Pith no claims yet

Pith reviewed 2026-05-15 02:01 UTC · model grok-4.3

classification 📊 stat.ME
keywords inlier recoveryrobust registrationGram matrix overlappoint set registrationexact recoveryrow-sum methodeigenvector matchingoutlier handling
0
0 comments X

The pith

The row-sum method recovers exact inlier matches from Gram-matrix overlaps even when inliers drop to order sqrt(n) and their fraction vanishes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper converts the problem of spotting matched points between two noisy datasets into a structured recovery task by taking the Hadamard product of their Gram matrices. This matrix highlights consistent pairwise distances among true matches while diluting signals from outliers. Two simple procedures follow: one extracts the leading eigenvector of the overlap matrix, the other sums rows and thresholds the totals. The row-sum procedure is shown to succeed exactly whenever dimension and sample size are comparable, even if only sqrt(n) points are inliers up to logarithmic factors. A reader would care because many practical alignment tasks, such as medical-image registration, routinely face heavy outlier contamination that defeats methods requiring a fixed positive fraction of good matches.

Core claim

By forming the Hadamard product of the two Gram matrices, the inlier-recovery task becomes one of identifying a structured submatrix whose row sums or leading eigenvector separate the matched points from the rest; the row-sum procedure recovers this set exactly under the scaling regime where dimension equals sample size and the number of inliers is order sqrt(n) (log factors), whereas the eigenvector procedure only guarantees weak recovery in the same regime.

What carries the argument

The Gram-matrix overlap (Hadamard product of the two point-set Gram matrices), which encodes consistent inner-product structure among inliers and produces an observable signal that row sums or the leading eigenvector can threshold to label matches.

If this is right

  • Registration pipelines can proceed after exact inlier identification without first assuming a non-vanishing fraction of good matches.
  • The same overlap construction yields a parallelizable algorithm suitable for large-scale point sets.
  • Exact recovery holds across a wider range of dimension-to-sample-size ratios than eigenvector-based alternatives.
  • Numerical tests on brain imaging volumes confirm that the recovered matches suffice for reliable alignment under heavy corruption.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same Gram-overlap idea could be tested on other correspondence problems such as feature matching in computer vision or node alignment in networks.
  • If the separation signal persists under milder noise assumptions, the row-sum threshold might be replaced by a data-driven rule that avoids any tuning parameter.
  • The sqrt(n) inlier scaling suggests the method could remain useful in regimes where only a sparse set of reliable anchors exists.

Load-bearing premise

The overlap matrix must carry a sufficiently strong and structured signal that separates the inlier block from outlier noise under the paper's implicit noise and outlier models.

What would settle it

Generate two point clouds in dimension n with exactly c sqrt(n) inliers for small constant c, add the noise and outlier levels stated in the proofs, apply the row-sum procedure, and check whether the recovered set equals the true inlier set with high probability.

Figures

Figures reproduced from arXiv: 2605.14444 by Ruizi Wu, Wanjie Wang, YueHaw Khoo.

Figure 1
Figure 1. Figure 1: Performance across different inlier proportions on the brain dataset, with [PITH_FULL_IMAGE:figures/full_fig_p012_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: From left to right, we show the original sky photo, the changed sky photo, and the [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Performance as the noise level increases for the two sky-data scenarios. Left panel: [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The yellow-highlighted pixels are the ones identified by the algorithm as extra items. The [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗
read the original abstract

Robust point-set registration in the presence of noise and outliers is challenging because the matched points (inliers) must be identified before reliable alignment can be performed. Existing robust registration methods typically optimize over the transformation space and are often designed for regimes with a nonvanishing fraction of inliers. In this paper, we study the inlier recovery problem arising in robust registration by comparing two datasets through the Hadamard product of their Gram matrices. This formulation converts the inlier identification into a structured recovery problem and avoids direct optimization over the rotation group. Based on this idea, we develop two methods: an eigenvector matching method based on the leading eigenvector of the Gram-matrix overlap, and a row-sum matching method based on aggregated entrywise comparison. We show that the eigenvector method achieves weak recovery when the dimension and sample size are of the same order, while the row-sum method achieves exact recovery under a broader range of dimensional scalings. In particular, when the dimension is comparable to the sample size, exact recovery is possible even when the inlier fraction vanishes, with the number of inliers as small as order $\sqrt{n}$, up to logarithmic factors. We also discuss a parallel implementation for large-scale settings. Numerical experiments on brain imaging data and image examples demonstrate that the proposed methods effectively identify matched structure under substantial corruption.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper formulates inlier recovery for robust point-set registration as a structured recovery task via the Hadamard product of Gram matrices from two point clouds. It introduces an eigenvector-matching method that achieves weak recovery when dimension d and sample size n are of the same order, and a row-sum method that achieves exact recovery under broader scalings, including exact recovery with k = Θ(√n) inliers (up to logs) when d ∼ n even as the inlier fraction vanishes. The work also discusses a parallel implementation and reports numerical experiments on brain imaging data and image examples.

Significance. If the claimed recovery guarantees are established with explicit noise models and valid concentration arguments, the reduction to Gram-matrix overlap offers a computationally attractive alternative to direct optimization over the rotation group, with the row-sum method's ability to handle vanishing inlier fractions representing a potentially useful advance for high-dimensional registration tasks. The parallel implementation is a practical strength for large-scale settings.

major comments (1)
  1. [Abstract / row-sum method] The exact-recovery claim for the row-sum method (k = Θ(√n) inliers when d ∼ n) requires that the minimum inlier row-sum of the Gram-matrix overlap strictly exceeds the maximum outlier row-sum with high probability. Because Gram matrices have rank at most d, the entries of their Hadamard product are dependent; standard sub-Gaussian tail bounds therefore do not apply directly to the row sums without additional incoherence or moment assumptions on the point clouds. The abstract states no explicit noise or outlier model and provides no concentration lemma establishing the required separation, leaving the central exact-recovery guarantee unsupported in the given description.
minor comments (2)
  1. [Abstract] The abstract mentions numerical experiments on brain imaging data but supplies no quantitative details (inlier fractions tested, error metrics, or baselines), making it impossible to assess empirical support for the theoretical claims.
  2. [Method description] Notation for the Gram-matrix overlap and the precise definition of the row-sum statistic should be introduced with an equation in the main text to avoid ambiguity when comparing the two methods.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We appreciate the positive assessment of the Gram-matrix overlap approach and address the major comment below.

read point-by-point responses
  1. Referee: [Abstract / row-sum method] The exact-recovery claim for the row-sum method (k = Θ(√n) inliers when d ∼ n) requires that the minimum inlier row-sum of the Gram-matrix overlap strictly exceeds the maximum outlier row-sum with high probability. Because Gram matrices have rank at most d, the entries of their Hadamard product are dependent; standard sub-Gaussian tail bounds therefore do not apply directly to the row sums without additional incoherence or moment assumptions on the point clouds. The abstract states no explicit noise or outlier model and provides no concentration lemma establishing the required separation, leaving the central exact-recovery guarantee unsupported in the given description.

    Authors: We agree that the abstract is too concise and does not mention the noise/outlier model or reference the supporting concentration result. The full manuscript (Section 2) defines an explicit model: inliers satisfy X_i = R Y_i + noise with isotropic Gaussian noise of variance σ², while outliers are drawn uniformly from the unit sphere. Theorem 4.3 proves exact recovery for the row-sum method when d ∼ n and k = Ω(√n log n), using a matrix Bernstein inequality that accounts for the rank-d dependence through incoherence assumptions on the point clouds (Assumption 3.1). The proof establishes the required row-sum separation with high probability. We will revise the abstract to briefly state the model and refer to Theorem 4.3 for the exact-recovery guarantee. This change will appear in the next version of the manuscript. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper introduces a Gram-matrix overlap formulation to convert inlier identification into a structured recovery problem, then analyzes two methods (eigenvector matching and row-sum matching) with claimed recovery guarantees. No equations or steps in the provided abstract reduce the claimed exact recovery thresholds (e.g., k=Θ(√n) inliers when d~n) to self-definitions, fitted inputs renamed as predictions, or load-bearing self-citations. The central claims rest on concentration arguments for the overlap matrix that are presented as derived rather than tautological, making the analysis self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on domain assumptions about point embeddings and outlier models that allow the overlap matrix to separate inliers; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption Point sets lie in Euclidean space so that Gram matrices capture pairwise inner products
    Standard modeling choice for registration problems
  • domain assumption Inliers produce consistent Gram entries while outliers do not, enabling separation via overlap
    Required for both eigenvector and row-sum recovery to succeed

pith-pipeline@v0.9.0 · 5535 in / 1284 out tokens · 44026 ms · 2026-05-15T02:01:34.869079+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

  1. [1]

    Finding a large hidden clique in a random graph.Random Structures & Algorithms, 13(3–4):457–466, 1998

    Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden clique in a random graph.Random Structures & Algorithms, 13(3–4):457–466, 1998

  2. [2]

    Community detection in dense random networks.The Annals of Statistics, 42(3):940–969, 2014

    Ery Arias-Castro and Nicolas Verzelen. Community detection in dense random networks.The Annals of Statistics, 42(3):940–969, 2014

  3. [3]

    K. S. Arun, T. S. Huang, and S. D. Blostein. Least-squares fitting of two 3-d point sets.IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-9(5):698–700, 1987

  4. [4]

    Maximizing robustness of point-set registration by leveraging non-convexity.arXiv preprint arXiv:2004.08772, 2020

    Cindy Orozco Bohorquez, Yuehaw Khoo, and Lexing Ying. Maximizing robustness of point-set registration by leveraging non-convexity.arXiv preprint arXiv:2004.08772, 2020

  5. [5]

    Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, Jos´ e Neira, Ian Reid, and John J. Leonard. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age.IEEE Transactions on Robotics, 32(6):1309–1332, 2016

  6. [6]

    Simultaneous localization and mapping: Part i.IEEE Robotics & Automation Magazine, 13(2):99–110, 2006

    Hugh Durrant-Whyte and Tim Bailey. Simultaneous localization and mapping: Part i.IEEE Robotics & Automation Magazine, 13(2):99–110, 2006

  7. [7]

    The spectral norm of random inner-product kernel matrices

    Zhou Fan and Andrea Montanari. The spectral norm of random inner-product kernel matrices. Probability Theory and Related Fields, 173:27–85, 2019

  8. [8]

    Freesurfer.NeuroImage, 62(2):774–781, 2012

    Bruce Fischl. Freesurfer.NeuroImage, 62(2):774–781, 2012

  9. [9]

    Glasser, Stamatios N

    Matthew F. Glasser, Stamatios N. Sotiropoulos, J. Anthony Wilson, Timothy S. Coal- son, Bruce Fischl, Jesper L. R. Andersson, Junqian Xu, Saad Jbabdi, Matthew Webster, Jonathan R. Polimeni, David C. Van Essen, and Mark Jenkinson. The minimal preprocessing pipelines for the human connectome project.NeuroImage, 80:105–124, 2013

  10. [10]

    Glasser and David C

    Matthew F. Glasser and David C. Van Essen. Mapping human cortical areas in vivo based on myelin content as revealed by t1- and t2-weighted mri.Journal of Neuroscience, 31(32):11597– 11616, 2011

  11. [11]

    Unsupervised alignment of embeddings with Wasserstein Procrustes

    Edouard Grave, Armand Joulin, and Quentin Berthet. Unsupervised alignment of embeddings with Wasserstein Procrustes. InProceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS), volume 89 ofProceedings of Machine Learning Research, pages 1880–1890. PMLR, 2019

  12. [12]

    Computational lower bounds for community detection on random graphs

    Bruce Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for community detection on random graphs. InProceedings of The 28th Conference on Learning Theory (COLT), volume 40 ofProceedings of Machine Learning Research, pages 899–928. PMLR, 2015

  13. [13]

    Cam- bridge University Press, 2 edition, 2004

    Richard Hartley and Andrew Zisserman.Multiple View Geometry in Computer Vision. Cam- bridge University Press, 2 edition, 2004

  14. [14]

    Convex relaxations of se (2) and se (3) for visual pose estimation

    Matanya B Horowitz, Nikolai Matni, and Joel W Burdick. Convex relaxations of se (2) and se (3) for visual pose estimation. In2014 IEEE International Conference on Robotics and Automation (ICRA), pages 1148–1154. IEEE, 2014. 17

  15. [15]

    Bannister, Michael Brady, and Stephen M

    Mark Jenkinson, Peter R. Bannister, Michael Brady, and Stephen M. Smith. Improved opti- misation for the robust and accurate linear registration and motion correction of brain images. NeuroImage, 17(2):825–841, 2002

  16. [16]

    Beckmann, Timothy E

    Mark Jenkinson, Christian F. Beckmann, Timothy E. J. Behrens, Mark W. Woolrich, and Stephen M. Smith. Fsl.NeuroImage, 62(2):782–790, 2012

  17. [17]

    Laurent and P

    B. Laurent and P. Massart. Adaptive estimation of a quadratic functional by model selection. The Annals of Statistics, 28(5):1302–1338, 2000

  18. [18]

    Convex relaxation methods for community detection.Statistical Science, 36(1):2–15, 2021

    Xiaodong Li, Yudong Chen, and Jiaming Xu. Convex relaxation methods for community detection.Statistical Science, 36(1):2–15, 2021

  19. [19]

    Marcus, John Harwell, Trudy Olsen, Michael Hodge, Matthew F

    Daniel S. Marcus, John Harwell, Trudy Olsen, Michael Hodge, Matthew F. Glasser, Fred Prior, Mark Jenkinson, Timothy Laumann, Sandra W. Curtiss, and David C. Van Essen. Informatics and data mining: Tools and strategies for the human connectome project.Frontiers in Neuroinformatics, 5:4, 2011

  20. [20]

    Functional maps: A flexible representation of maps between shapes.ACM Transactions on Graphics (TOG), 2012

    Maks Ovsjanikov, Mirela Ben-Chen, Justin Solomon, Adrian Butscher, and Leonidas Guibas. Functional maps: A flexible representation of maps between shapes.ACM Transactions on Graphics (TOG), 2012

  21. [21]

    Sotiropoulos, Steen Moeller, Saad Jbabdi, Junqian Xu, Jesper L

    Stamatios N. Sotiropoulos, Steen Moeller, Saad Jbabdi, Junqian Xu, Jesper L. R. Andersson, Edward J. Auerbach, Essa Yacoub, David Feinberg, Kawin Setsompop, Lawrence L. Wald, Timothy E. J. Behrens, Kamil Ugurbil, and Christophe Lenglet. Effects of image reconstruction on fibre orientation mapping from multichannel diffusion mri: Reducing the noise floor u...

  22. [22]

    Van Essen, Stephen M

    David C. Van Essen, Stephen M. Smith, Deanna M. Barch, Timothy E. J. Behrens, Essa Yacoub, Kamil Ugurbil, and WU-Minn HCP Consortium. The wu-minn human connectome project: An overview.NeuroImage, 80:62–79, 2013

  23. [23]

    A robust approach to nonlinear multivariate analysis.(No Title), 1994

    Peter Verboon. A robust approach to nonlinear multivariate analysis.(No Title), 1994

  24. [24]

    Cambridge University Press, 2018

    Roman Vershynin.High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press, 2018

  25. [25]

    A least squares estimate of satellite attitude.SIAM Review, 7(3):409, 1965

    Grace Wahba. A least squares estimate of satellite attitude.SIAM Review, 7(3):409, 1965

  26. [26]

    Manifold alignment using procrustes analysis

    Chang Wang and Sridhar Mahadevan. Manifold alignment using procrustes analysis. In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 1120– 1127, 2008

  27. [27]

    A Polynomial-time Solution for Robust Registration with Extreme Outlier Rates

    Heng Yang and Luca Carlone. A polynomial-time solution for robust registration with extreme outlier rates.arXiv preprint arXiv:1903.08588, 2019

  28. [28]

    Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the davis–kahan theorem for statisticians.Biometrika, 102(2):315–323, 2014

  29. [29]

    exp ( t Z π/2 0 ∇Yθ Fu(X, Yθ), Y ′ θ dθ ) IBY X # . Applying Inequality 4, E[exp{tF u(X, Y)}I BY |X]≤E

    Juyong Zhang, Yuxin Yao, and Bailin Deng. Fast and robust iterative closest point.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(7):3450–3466, 2021. 18 A Pre-processing on the Brain Image Data The details of the data features and pre-processing strategies are listed in Table 3. Table 3: Description of the Brain Data and Pre-processing ...