The Umeyama algorithm for matching correlated Gaussian geometric models in the low-dimensional regime
Pith reviewed 2026-05-24 03:43 UTC · model grok-4.3
The pith
The Umeyama algorithm recovers the latent permutation exactly when noise satisfies σ = o(d^{-3} n^{-2/d}).
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given n i.i.d. pairs of correlated Gaussian vectors in R^d with noise parameter σ and the associated inner-product matrices A and B, the Umeyama algorithm recovers the hidden permutation π* exactly when σ = o(d^{-3} n^{-2/d}) and almost exactly when σ = o(d^{-3} n^{-1/d}), for d = O(log n). These thresholds approach the information limits established by prior work up to a poly(d) factor.
What carries the argument
The Umeyama algorithm, which estimates the permutation by taking the singular-value decomposition of the product of the two Gram matrices.
If this is right
- Exact recovery of the permutation is possible with the Umeyama method whenever the noise is smaller than d^{-3} n^{-2/d}.
- Almost exact recovery holds for noise up to the larger scale d^{-3} n^{-1/d}.
- The algorithm operates within a polynomial factor of the information-theoretic boundary in the low-dimensional regime.
- The same noise scalings govern success for both exact and almost exact versions of the recovery task.
Where Pith is reading between the lines
- If the noise exceeds the stated threshold by more than a constant factor, the algorithm is expected to fail even though the information-theoretic possibility may remain open.
- Similar spectral analysis may extend to other inner-product or distance-based matching problems that share the same low-dimensional scaling.
- Numerical checks with moderate n and d=log n could directly test whether the poly(d) gap can be removed.
- The result suggests that classical spectral methods remain competitive for geometric matching once dimension is not too large.
Load-bearing premise
The analysis requires that dimension d grows at most logarithmically with the number of points n and that the only correlation between the two point sets is the unknown permutation.
What would settle it
Run the Umeyama algorithm on instances with d fixed at 5, n around 2000, and σ set to 2 d^{-3} n^{-2/d}; exact recovery should fail with high probability if the claimed threshold is tight.
read the original abstract
Motivated by the problem of matching two correlated random geometric graphs, we study the problem of matching two Gaussian geometric models correlated through a latent node permutation. Specifically, given an unknown permutation $\pi^*$ on $\{1,\ldots,n\}$ and given $n$ i.i.d. pairs of correlated Gaussian vectors $\{X_{\pi^*(i)},Y_i\}$ in $\mathbb{R}^d$ with noise parameter $\sigma$, we consider two types of (correlated) weighted complete graphs with edge weights given by $A_{i,j}=\langle X_i,X_j \rangle$, $B_{i,j}=\langle Y_i,Y_j \rangle$. The goal is to recover the hidden vertex correspondence $\pi^*$ based on the observed matrices $A$ and $B$. For the low-dimensional regime where $d=O(\log n)$, Wang, Wu, Xu, and Yolou [WWXY22+] established the information thresholds for exact and almost exact recovery in matching correlated Gaussian geometric models. They also conducted numerical experiments for the classical Umeyama algorithm. In our work, we prove that this algorithm achieves exact recovery of $\pi^*$ when the noise parameter $\sigma=o(d^{-3}n^{-2/d})$, and almost exact recovery when $\sigma=o(d^{-3}n^{-1/d})$. Our results approach the information thresholds up to a $\operatorname{poly}(d)$ factor in the low-dimensional regime.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes the Umeyama algorithm for recovering a latent permutation π* that correlates two Gaussian geometric models in ℝ^d. In the low-dimensional regime d = O(log n), it proves that the algorithm achieves exact recovery of π* when the noise parameter satisfies σ = o(d^{-3} n^{-2/d}) and almost-exact recovery when σ = o(d^{-3} n^{-1/d}). These thresholds approach the information-theoretic limits established in WWXY22+ up to a poly(d) factor.
Significance. If the stated recovery guarantees hold, the work supplies the first rigorous performance analysis of the classical Umeyama algorithm under the correlated Gaussian geometric model. It is significant because it closes most of the gap between algorithmic and information-theoretic thresholds in the low-d regime while explicitly acknowledging the remaining poly(d) separation. The use of standard random-matrix and concentration tools to obtain explicit noise scalings is a strength.
minor comments (3)
- [Abstract] Abstract and §1: the citation WWXY22+ is used for the information thresholds; the bibliography entry should be completed with the full arXiv number or journal details for traceability.
- [§3] §3 (model definition): the precise joint distribution of the pairs (X_i, Y_i) is stated via correlation parameter σ, but an explicit expression for the covariance matrix of the concatenated vector would remove any ambiguity about the correlation structure.
- [Theorem 2.1] Theorem 2.1 and its proof: the o(·) notation absorbs poly(d) factors; a short remark clarifying whether the hidden constants depend on d only polynomially or exponentially would strengthen readability.
Simulated Author's Rebuttal
We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. The report lists no specific major comments, so we have no points requiring rebuttal or clarification at this stage.
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper establishes recovery guarantees for the Umeyama algorithm via direct analysis of the Gaussian geometric model under the stated low-dimensional scaling. The central theorems rely on standard random matrix concentration bounds and perturbation arguments applied to the observed inner-product matrices A and B. The reference to WWXY22+ is used solely to benchmark the information-theoretic thresholds and does not supply any load-bearing step, uniqueness claim, or ansatz for the algorithmic analysis itself. No step reduces a claimed prediction to a fitted parameter, self-citation chain, or definitional equivalence; the poly(d) gap to the information threshold is explicitly noted rather than concealed. The derivation therefore stands on independent probabilistic estimates.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard sub-Gaussian concentration and random matrix bounds hold for the inner-product matrices A and B.
Forward citations
Cited by 1 Pith paper
-
Attributed Network Alignment: Statistical Limits and Efficient Algorithm
The paper characterizes exact and partial recovery thresholds in the featured correlated Gaussian Wigner model and proposes the QPAlign quadratic programming algorithm with theoretical guarantees.
Reference graph
Works this paper leans on
-
[1]
A. Armiti and M. Gertz. Geometric graph matching and similarity: A probabilistic approach. In Proceedings of the 26th International Conference on Scient ific and Statistical Database Man- agement, pages 1–12, 2014. 27
work page 2014
- [2]
-
[3]
A. Berg, T. Berg, and J. Malik. Shape matching and object recog nition using low distortion correspondences. In 2005 IEEE Computer Society Conference on Computer Vision an d Pattern Recognition (CVPR), volume 1, pages 26–33, 2005
work page 2005
-
[4]
S. N. Bernstein. On a modification of Chebyshev’s inequality and of the error formula of Laplace. In Ann. Sci. Inst. Sav. Ukraine, Sect. Math , 1(4):38–49, 1924
work page 1924
- [5]
-
[6]
T. Cour, P. Srinivasan, and J. Shi. Balanced graph matching. In Advances in Neural Infor- mation Processing Systems (NIPS) , volume 19. MIT Press, 2006
work page 2006
-
[7]
Exact alignment recovery for correlated Erd\H{o}s-R\'enyi graphs
D. Cullina and N. Kiyavash. Exact alignment recovery for correlat ed Erd˝ os-R´ enyi graphs. Preprint, arXiv:1711.06783
work page internal anchor Pith review Pith/arXiv arXiv
-
[8]
D. Cullina and N. Kiyavash. Improved achievability and converse bo unds for Erd˝ os-R´ enyi graph matching. In Proceedings of the 2016 ACM International Conference on Mea surement and Modeling of Computer Science (SIGMETRICS) , pages 63–72, New York, NY, USA, 2016. Association for Computing Machinery
work page 2016
-
[9]
D. Cullina, N. Kiyavash, P. Mittal, and H. V. Poor. Partial recover y of Erd˝ os-R´ enyi graph alignment via k-core alignment. In Proceedings of the 2016 ACM International Conference on Measurement and Modeling of Computer Science (SIGMETRICS) , pages 99–100, New York, NY, USA, 2020. Association for Computing Machinery
work page 2016
-
[10]
O. E. Dai, D. Cullina, and N. Kiyavash. Database alignment with Gau ssian features. In The 22nd International Conference on Artificial Intelligen ce and Statistics , pages 3225–3233. PMLR, 2019
work page 2019
-
[11]
O. E. Dai, D. Cullina, and N. Kiyavash. Achievability of nearly-exac t alignment for correlated Gaussian databases. In 2020 IEEE International Symposium on Information Theory (I SIT), pages 1230–1235. IEEE, 2020
work page 2020
-
[12]
O. E. Dai, D. Cullina, N. Kiyavash, and M. Grossglauser. Analysis o f a canonical labeling algorithm for the alignment of correlated Erd˝ os-R´ enyi graphs.In Proceedings of the ACM on Measurement and Analysis of Computing Systems , 3(2), jun 2019
work page 2019
-
[13]
C. Davis and W. Kahan. The rotation of eigenvectors by a pertu rbation. In SIAM Journal on Numerical Analysis, 7:1–46, 1970
work page 1970
-
[14]
J. Ding and H. Du. Detection threshold for correlated Erd˝ os- R´ enyi graphs via densest sub- graph. In IEEE Transactions on Information Theory , 69(8):5289–5298, 2023
work page 2023
-
[15]
J. Ding and H. Du. Matching recovery threshold for correlated random graphs. In Annals of Statistics, 51(4): 1718-1743, 2023
work page 2023
-
[16]
J. Ding, H. Du and S. Gong. A polynomial-time approximation schem e for the maximal overlap of two independent Erd˝ os-R´ enyi graphs. To appear inRandom Structures and Algorithms . 28
- [17]
- [18]
-
[19]
J. Ding and Z. Li. A polynomial time iterative algorithm for matching Gaussian matrices with non-vanishing correlation. Preprint, arXiv:2212.13677
-
[20]
J. Ding and Z. Li. A polynomial-time iterative algorithm for random g raph matching with non-vanishing correlation. Preprint, arXiv:2306.00266
-
[21]
J. Ding, Z. Ma, Y. Wu, and J. Xu. Efficient random graph matching via degree profiles. In Probability Theory and Related Fields , 179(1-2):29–115, 2021
work page 2021
- [22]
-
[23]
Z. Fan, C. Mao, Y. Wu, and J. Xu. Spectral graph matching and regularized quadratic relax- ations I: Algorithm and theory. In Foundations of Computational Mathematics , 23(5):1511- 1565, 2023
work page 2023
-
[24]
Z. Fan, C. Mao, Y. Wu, and J. Xu. Spectral graph matching and regularized quadratic relax- ations II: Erd˝ os-R´ enyi graphs and universality. InFoundations of Computational Mathematics , 23(5):1567-1617, 2023
work page 2023
-
[25]
S. Feizi, G. Quon, M. Medard, M. Kellis, and A. Jadbabaie. Spectr al alignment of networks. Preprint, arXiv:1602.04181
work page internal anchor Pith review Pith/arXiv arXiv
-
[26]
R. Feng, G. Tian and D. Wei. Small gaps of GOE. In Geometric and Functional Analysis , 29(6):1794–1827, 2019
work page 2019
-
[27]
L. Ganassali and L. Massouli´ e. From tree matching to sparse graph alignment. In Proceedings of Thirty Third Conference on Learning Theory (COLT) , volume 125 of Proceedings of Machine Learning Research, pages 1633–1665, 2020
work page 2020
-
[28]
L. Ganassali, L. Massouli´ e, and M. Lelarge. Impossibility of Par tial Recovery in the Graph Alignment Problem. In Proceedings of Thirty Fourth Conference on Learning Theory (COLT), volume 134 of Proceedings of Machine Learning Research , pages 2080–2102, 2021
work page 2080
-
[29]
L. Ganassali, L. Massouli´ e, and M. Lelarge. Correlation detec tion in trees for planted graph alignment. To appear in Annals of Applied Probability
-
[30]
L. Ganassali, L. Massouli´ e, and G. Semerjian. Statistical limits of correlation detection in trees. Preprint, arXiv:2209.13723
-
[31]
A. Haghighi, A. Ng, and C. Manning. Robust textual inference v ia graph matching. In Pro- ceedings of Human Language Technology Conference and Confe rence on Empirical Methods in Natural Language Processing (EMNLP), pages 387–394, Vancouver, British Columbia, Canada, Oct 2005
work page 2005
-
[32]
G. Hall and L. Massouli´ e. Partial recovery in the graph alignme nt problem. In Operations Research, 71(1):259–272, 2023. 29
work page 2023
- [33]
-
[34]
H. W. Kuhn. The Hungarian method for the assignment problem. In Naval research logistics quarterly, 2(1-2):83–97, 1955
work page 1955
-
[35]
D. Kunisky and J. N. Weed. Strong recovery of geometric plant ed matchings. In In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithm s (SODA) , pages 834–876. SIAM, 2022
work page 2022
-
[36]
V. Lyzinski, D. E. Fishkind, and C. E. Priebe. Seeded graph matc hing for correlated Erd˝ os- R´ enyi graphs. InJournal of Machine Learning Research , 15:3513–3540, 2014
work page 2014
-
[37]
C. Mao, M. Rudelson, and K. Tikhomirov. Random Graph Matching with Improved Noise Robustness. In Proceedings of Thirty Fourth Conference on Learning Theory (COLT), volume 134 of Proceedings of Machine Learning Research , pages 3296–3329, 2021
work page 2021
-
[38]
C. Mao, M. Rudelson, and K. Tikhomirov. Exact matching of rand om graphs with constant correlation. In Probability Theory and Related Fields , 186(2):327–389, 2023
work page 2023
-
[39]
C. Mao, Y. Wu, J. Xu, and S. H. Yu. Testing network correlation efficiently via counting trees. To appear in Annals of Statistics
-
[40]
C. Mao, Y. Wu, J. Xu, and S. H. Yu, Random graph matching at Ot ter’s threshold via counting chandeliers. In Proceedings of the 55th Annual ACM Symposium on Theory of Com puting (STOC), pages 1345–1356, 2023
work page 2023
-
[41]
E. Mossel and J. Xu. Seeded graph matching via large neighborh ood statistics. In Random Structures and Algorithms , 57(3):570–611, 2020
work page 2020
-
[42]
I. Nourdin, G. Peccati, and A. R´ eveillac. Multivariate normal ap proximation using Stein’s method and Malliavin calculus. In Annales de l’IHP Probabilit´ es et statistiques, 46(1):45–58, 2010
work page 2010
-
[43]
A. Narayanan and V. Shmatikov. Robust de-anonymization of la rge sparse datasets. In 2008 IEEE Symposium on Security and Privacy (ISSP) , pages 111–125, 2008
work page 2008
-
[44]
A. Narayanan and V. Shmatikov. De-anonymizing social networ ks. In 2009 30th IEEE Sym- posium on Security and Privacy (ISSP) , pages 173–187, 2009
work page 2009
-
[45]
P. Pedarsani and M. Grossglauser. On the privacy of anonymiz ed networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge D iscovery and Data Mining , pages 1235–1243, New York, NY, USA, 2011. Association for Comp uting Machinery
work page 2011
-
[46]
M. Z. Racz and A. Sridhar. Correlated randomly growing graphs . In Annals of Applied Probability, 32(2):1058–1111, 2022
work page 2022
-
[47]
M. Z. Racz and A. Sridhar. Correlated stochastic block models: Exact graph matching with applications to recovering communities. In Advances in Neural Information Processing Systems (NIPS), volume 34. Curran Associates, Inc., 2021
work page 2021
-
[48]
F. Shirani, S. Garg, and E. Erkip. Seeded graph matching: Efficie nt algorithms and theoretical guarantees. In 2017 51st Asilomar Conference on Signals, Systems, and Comp uters, pages 253–257, 2017. 30
work page 2017
- [49]
-
[50]
S. Umeyama. An eigen-decomposition approach to weighted gra ph matching problems. In IEEE Transactions on Pattern Analysis and Machine Intellig ence, 10(5):695–703, 1988
work page 1988
-
[51]
G. W. Stewart. The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators. In SIAM Journal on Numerical Analysis , 17(3):403–409, 1980
work page 1980
- [52]
-
[53]
J. T. Vogelstein, J. M. Conroy, V. Lyzinski, L. J. Podrazik, S. G . Kratzer, E. T. Harley, D. E. Fishkind, R. J. Vogelstein, and C. E. Priebe. Fast approximate quad ratic programming for graph matching. In PLOS ONE , 10(4):1–17, 2015
work page 2015
-
[54]
H. Wang, Y. Wu, J. Xu and I. Yolou. Random graph matching in geo metric models: the case of complete graphs. In Proceedings of 35th Conference on Learning Theory (COLT) , volume 178 of Proceedings of Machine Learning Research , pages 3441–3488, 2022
work page 2022
-
[55]
Y. Wu, J. Xu and S. H. Yu. Testing correlation of unlabeled rando m graphs. In Annals of Applied Probability, 33(4): 2519–2558, 2023
work page 2023
-
[56]
Y. Wu, J. Xu and S. H. Yu. Settling the sharp reconstruction th resholds of random graph matching. In IEEE Transactions on Information Theory , 68(8):5391–5417, 2022
work page 2022
-
[57]
L. Yartseva and M. Grossglauser. On the performance of per colation graph matching. In Proceedings of the First ACM Conference on Online Social Net works (COSN) , pages 119–130, New York, NY, USA, 2013. Association for Computing Machinery
work page 2013
-
[58]
Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the Da vis-Kahan theorem for statisticians. In Biometrika, 102(2):315–323, 2015. 31
work page 2015
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.