pith. sign in

arxiv: 2402.15095 · v2 · submitted 2024-02-23 · 🧮 math.ST · cs.DS· cs.LG· math.PR· stat.TH

The Umeyama algorithm for matching correlated Gaussian geometric models in the low-dimensional regime

Pith reviewed 2026-05-24 03:43 UTC · model grok-4.3

classification 🧮 math.ST cs.DScs.LGmath.PRstat.TH
keywords Umeyama algorithmgraph matchingGaussian geometric modelsexact recoveryalmost exact recoverylow-dimensional regimepermutation recoverycorrelated random graphs
0
0 comments X

The pith

The Umeyama algorithm recovers the latent permutation exactly when noise satisfies σ = o(d^{-3} n^{-2/d}).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies recovery of an unknown node correspondence between two sets of n correlated Gaussian vectors in dimension d. It proves that the classical Umeyama spectral algorithm succeeds at exact recovery under a noise bound that is only a polynomial factor in d away from the information-theoretic threshold, and at almost exact recovery under a slightly weaker bound. The results apply in the regime where d grows at most logarithmically with n and the vectors are drawn i.i.d. from a latent-permutation correlation model.

Core claim

Given n i.i.d. pairs of correlated Gaussian vectors in R^d with noise parameter σ and the associated inner-product matrices A and B, the Umeyama algorithm recovers the hidden permutation π* exactly when σ = o(d^{-3} n^{-2/d}) and almost exactly when σ = o(d^{-3} n^{-1/d}), for d = O(log n). These thresholds approach the information limits established by prior work up to a poly(d) factor.

What carries the argument

The Umeyama algorithm, which estimates the permutation by taking the singular-value decomposition of the product of the two Gram matrices.

If this is right

  • Exact recovery of the permutation is possible with the Umeyama method whenever the noise is smaller than d^{-3} n^{-2/d}.
  • Almost exact recovery holds for noise up to the larger scale d^{-3} n^{-1/d}.
  • The algorithm operates within a polynomial factor of the information-theoretic boundary in the low-dimensional regime.
  • The same noise scalings govern success for both exact and almost exact versions of the recovery task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the noise exceeds the stated threshold by more than a constant factor, the algorithm is expected to fail even though the information-theoretic possibility may remain open.
  • Similar spectral analysis may extend to other inner-product or distance-based matching problems that share the same low-dimensional scaling.
  • Numerical checks with moderate n and d=log n could directly test whether the poly(d) gap can be removed.
  • The result suggests that classical spectral methods remain competitive for geometric matching once dimension is not too large.

Load-bearing premise

The analysis requires that dimension d grows at most logarithmically with the number of points n and that the only correlation between the two point sets is the unknown permutation.

What would settle it

Run the Umeyama algorithm on instances with d fixed at 5, n around 2000, and σ set to 2 d^{-3} n^{-2/d}; exact recovery should fail with high probability if the claimed threshold is tight.

read the original abstract

Motivated by the problem of matching two correlated random geometric graphs, we study the problem of matching two Gaussian geometric models correlated through a latent node permutation. Specifically, given an unknown permutation $\pi^*$ on $\{1,\ldots,n\}$ and given $n$ i.i.d. pairs of correlated Gaussian vectors $\{X_{\pi^*(i)},Y_i\}$ in $\mathbb{R}^d$ with noise parameter $\sigma$, we consider two types of (correlated) weighted complete graphs with edge weights given by $A_{i,j}=\langle X_i,X_j \rangle$, $B_{i,j}=\langle Y_i,Y_j \rangle$. The goal is to recover the hidden vertex correspondence $\pi^*$ based on the observed matrices $A$ and $B$. For the low-dimensional regime where $d=O(\log n)$, Wang, Wu, Xu, and Yolou [WWXY22+] established the information thresholds for exact and almost exact recovery in matching correlated Gaussian geometric models. They also conducted numerical experiments for the classical Umeyama algorithm. In our work, we prove that this algorithm achieves exact recovery of $\pi^*$ when the noise parameter $\sigma=o(d^{-3}n^{-2/d})$, and almost exact recovery when $\sigma=o(d^{-3}n^{-1/d})$. Our results approach the information thresholds up to a $\operatorname{poly}(d)$ factor in the low-dimensional regime.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript analyzes the Umeyama algorithm for recovering a latent permutation π* that correlates two Gaussian geometric models in ℝ^d. In the low-dimensional regime d = O(log n), it proves that the algorithm achieves exact recovery of π* when the noise parameter satisfies σ = o(d^{-3} n^{-2/d}) and almost-exact recovery when σ = o(d^{-3} n^{-1/d}). These thresholds approach the information-theoretic limits established in WWXY22+ up to a poly(d) factor.

Significance. If the stated recovery guarantees hold, the work supplies the first rigorous performance analysis of the classical Umeyama algorithm under the correlated Gaussian geometric model. It is significant because it closes most of the gap between algorithmic and information-theoretic thresholds in the low-d regime while explicitly acknowledging the remaining poly(d) separation. The use of standard random-matrix and concentration tools to obtain explicit noise scalings is a strength.

minor comments (3)
  1. [Abstract] Abstract and §1: the citation WWXY22+ is used for the information thresholds; the bibliography entry should be completed with the full arXiv number or journal details for traceability.
  2. [§3] §3 (model definition): the precise joint distribution of the pairs (X_i, Y_i) is stated via correlation parameter σ, but an explicit expression for the covariance matrix of the concatenated vector would remove any ambiguity about the correlation structure.
  3. [Theorem 2.1] Theorem 2.1 and its proof: the o(·) notation absorbs poly(d) factors; a short remark clarifying whether the hidden constants depend on d only polynomially or exponentially would strengthen readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. The report lists no specific major comments, so we have no points requiring rebuttal or clarification at this stage.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper establishes recovery guarantees for the Umeyama algorithm via direct analysis of the Gaussian geometric model under the stated low-dimensional scaling. The central theorems rely on standard random matrix concentration bounds and perturbation arguments applied to the observed inner-product matrices A and B. The reference to WWXY22+ is used solely to benchmark the information-theoretic thresholds and does not supply any load-bearing step, uniqueness claim, or ansatz for the algorithmic analysis itself. No step reduces a claimed prediction to a fitted parameter, self-citation chain, or definitional equivalence; the poly(d) gap to the information threshold is explicitly noted rather than concealed. The derivation therefore stands on independent probabilistic estimates.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract only: relies on standard concentration inequalities for Gaussians and random matrices; no free parameters or invented entities visible.

axioms (1)
  • standard math Standard sub-Gaussian concentration and random matrix bounds hold for the inner-product matrices A and B.
    Invoked implicitly to control the perturbation between the observed matrices and the ideal permutation-aligned version.

pith-pipeline@v0.9.0 · 5792 in / 1241 out tokens · 23587 ms · 2026-05-24T03:43:58.789146+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Attributed Network Alignment: Statistical Limits and Efficient Algorithm

    math.ST 2026-04 unverdicted novelty 7.0

    The paper characterizes exact and partial recovery thresholds in the featured correlated Gaussian Wigner model and proposes the QPAlign quadratic programming algorithm with theoretical guarantees.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    Armiti and M

    A. Armiti and M. Gertz. Geometric graph matching and similarity: A probabilistic approach. In Proceedings of the 26th International Conference on Scient ific and Statistical Database Man- agement, pages 1–12, 2014. 27

  2. [2]

    Barak, C

    B. Barak, C. N. Chou, Z. Lei, T. Schramm, and Y. Sheng. (Nearly ) efficient algorithms for the graph matching problem on correlated random graphs. In Advances in Neural Information Processing Systems (NIPS) , volume 32. Curran Associates, Inc., 2019

  3. [3]

    A. Berg, T. Berg, and J. Malik. Shape matching and object recog nition using low distortion correspondences. In 2005 IEEE Computer Society Conference on Computer Vision an d Pattern Recognition (CVPR), volume 1, pages 26–33, 2005

  4. [4]

    S. N. Bernstein. On a modification of Chebyshev’s inequality and of the error formula of Laplace. In Ann. Sci. Inst. Sav. Ukraine, Sect. Math , 1(4):38–49, 1924

  5. [5]

    Bozorg, S

    M. Bozorg, S. Salehkaleybar, and M. Hashemi. Seedless graph ma tching via tail of degree distribution for correlated Erd˝ os-R´ enyi graphs. Preprint,arXiv:1907.06334

  6. [6]

    T. Cour, P. Srinivasan, and J. Shi. Balanced graph matching. In Advances in Neural Infor- mation Processing Systems (NIPS) , volume 19. MIT Press, 2006

  7. [7]

    Exact alignment recovery for correlated Erd\H{o}s-R\'enyi graphs

    D. Cullina and N. Kiyavash. Exact alignment recovery for correlat ed Erd˝ os-R´ enyi graphs. Preprint, arXiv:1711.06783

  8. [8]

    Cullina and N

    D. Cullina and N. Kiyavash. Improved achievability and converse bo unds for Erd˝ os-R´ enyi graph matching. In Proceedings of the 2016 ACM International Conference on Mea surement and Modeling of Computer Science (SIGMETRICS) , pages 63–72, New York, NY, USA, 2016. Association for Computing Machinery

  9. [9]

    Cullina, N

    D. Cullina, N. Kiyavash, P. Mittal, and H. V. Poor. Partial recover y of Erd˝ os-R´ enyi graph alignment via k-core alignment. In Proceedings of the 2016 ACM International Conference on Measurement and Modeling of Computer Science (SIGMETRICS) , pages 99–100, New York, NY, USA, 2020. Association for Computing Machinery

  10. [10]

    O. E. Dai, D. Cullina, and N. Kiyavash. Database alignment with Gau ssian features. In The 22nd International Conference on Artificial Intelligen ce and Statistics , pages 3225–3233. PMLR, 2019

  11. [11]

    O. E. Dai, D. Cullina, and N. Kiyavash. Achievability of nearly-exac t alignment for correlated Gaussian databases. In 2020 IEEE International Symposium on Information Theory (I SIT), pages 1230–1235. IEEE, 2020

  12. [12]

    O. E. Dai, D. Cullina, N. Kiyavash, and M. Grossglauser. Analysis o f a canonical labeling algorithm for the alignment of correlated Erd˝ os-R´ enyi graphs.In Proceedings of the ACM on Measurement and Analysis of Computing Systems , 3(2), jun 2019

  13. [13]

    Davis and W

    C. Davis and W. Kahan. The rotation of eigenvectors by a pertu rbation. In SIAM Journal on Numerical Analysis, 7:1–46, 1970

  14. [14]

    Ding and H

    J. Ding and H. Du. Detection threshold for correlated Erd˝ os- R´ enyi graphs via densest sub- graph. In IEEE Transactions on Information Theory , 69(8):5289–5298, 2023

  15. [15]

    Ding and H

    J. Ding and H. Du. Matching recovery threshold for correlated random graphs. In Annals of Statistics, 51(4): 1718-1743, 2023

  16. [16]

    J. Ding, H. Du and S. Gong. A polynomial-time approximation schem e for the maximal overlap of two independent Erd˝ os-R´ enyi graphs. To appear inRandom Structures and Algorithms . 28

  17. [17]

    J. Ding, H. Du and Z. Li. Low-Degree Hardness of Detection for Correlated Erd˝ os-R´ enyi Graphs. Preprint, arXiv: 2311.15931

  18. [18]

    J. Ding, Y. Fei and Y. Wang. Efficiently matching random inhomoge neous graphs via degree profiles. Preprint, arXiv: 2310.10441

  19. [19]

    Ding and Z

    J. Ding and Z. Li. A polynomial time iterative algorithm for matching Gaussian matrices with non-vanishing correlation. Preprint, arXiv:2212.13677

  20. [20]

    Ding and Z

    J. Ding and Z. Li. A polynomial-time iterative algorithm for random g raph matching with non-vanishing correlation. Preprint, arXiv:2306.00266

  21. [21]

    J. Ding, Z. Ma, Y. Wu, and J. Xu. Efficient random graph matching via degree profiles. In Probability Theory and Related Fields , 179(1-2):29–115, 2021

  22. [22]

    H. Du, S. Gong and R. Huang. The Algorithmic Phase Transition of Random Graph Alignment Problem. Preprint, arXiv:2307.06590

  23. [23]

    Z. Fan, C. Mao, Y. Wu, and J. Xu. Spectral graph matching and regularized quadratic relax- ations I: Algorithm and theory. In Foundations of Computational Mathematics , 23(5):1511- 1565, 2023

  24. [24]

    Z. Fan, C. Mao, Y. Wu, and J. Xu. Spectral graph matching and regularized quadratic relax- ations II: Erd˝ os-R´ enyi graphs and universality. InFoundations of Computational Mathematics , 23(5):1567-1617, 2023

  25. [25]

    Spectral Alignment of Graphs

    S. Feizi, G. Quon, M. Medard, M. Kellis, and A. Jadbabaie. Spectr al alignment of networks. Preprint, arXiv:1602.04181

  26. [26]

    R. Feng, G. Tian and D. Wei. Small gaps of GOE. In Geometric and Functional Analysis , 29(6):1794–1827, 2019

  27. [27]

    Ganassali and L

    L. Ganassali and L. Massouli´ e. From tree matching to sparse graph alignment. In Proceedings of Thirty Third Conference on Learning Theory (COLT) , volume 125 of Proceedings of Machine Learning Research, pages 1633–1665, 2020

  28. [28]

    Ganassali, L

    L. Ganassali, L. Massouli´ e, and M. Lelarge. Impossibility of Par tial Recovery in the Graph Alignment Problem. In Proceedings of Thirty Fourth Conference on Learning Theory (COLT), volume 134 of Proceedings of Machine Learning Research , pages 2080–2102, 2021

  29. [29]

    Ganassali, L

    L. Ganassali, L. Massouli´ e, and M. Lelarge. Correlation detec tion in trees for planted graph alignment. To appear in Annals of Applied Probability

  30. [30]

    Ganassali, L

    L. Ganassali, L. Massouli´ e, and G. Semerjian. Statistical limits of correlation detection in trees. Preprint, arXiv:2209.13723

  31. [31]

    Haghighi, A

    A. Haghighi, A. Ng, and C. Manning. Robust textual inference v ia graph matching. In Pro- ceedings of Human Language Technology Conference and Confe rence on Empirical Methods in Natural Language Processing (EMNLP), pages 387–394, Vancouver, British Columbia, Canada, Oct 2005

  32. [32]

    Hall and L

    G. Hall and L. Massouli´ e. Partial recovery in the graph alignme nt problem. In Operations Research, 71(1):259–272, 2023. 29

  33. [33]

    Kazemi, S

    E. Kazemi, S. H. Hassani, and M. Grossglauser. Growing a graph matching from a handful of seeds. In Proc. VLDB Endow. , 8(10):1010–1021, jun 2015

  34. [34]

    H. W. Kuhn. The Hungarian method for the assignment problem. In Naval research logistics quarterly, 2(1-2):83–97, 1955

  35. [35]

    Kunisky and J

    D. Kunisky and J. N. Weed. Strong recovery of geometric plant ed matchings. In In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithm s (SODA) , pages 834–876. SIAM, 2022

  36. [36]

    Lyzinski, D

    V. Lyzinski, D. E. Fishkind, and C. E. Priebe. Seeded graph matc hing for correlated Erd˝ os- R´ enyi graphs. InJournal of Machine Learning Research , 15:3513–3540, 2014

  37. [37]

    C. Mao, M. Rudelson, and K. Tikhomirov. Random Graph Matching with Improved Noise Robustness. In Proceedings of Thirty Fourth Conference on Learning Theory (COLT), volume 134 of Proceedings of Machine Learning Research , pages 3296–3329, 2021

  38. [38]

    C. Mao, M. Rudelson, and K. Tikhomirov. Exact matching of rand om graphs with constant correlation. In Probability Theory and Related Fields , 186(2):327–389, 2023

  39. [39]

    C. Mao, Y. Wu, J. Xu, and S. H. Yu. Testing network correlation efficiently via counting trees. To appear in Annals of Statistics

  40. [40]

    C. Mao, Y. Wu, J. Xu, and S. H. Yu, Random graph matching at Ot ter’s threshold via counting chandeliers. In Proceedings of the 55th Annual ACM Symposium on Theory of Com puting (STOC), pages 1345–1356, 2023

  41. [41]

    Mossel and J

    E. Mossel and J. Xu. Seeded graph matching via large neighborh ood statistics. In Random Structures and Algorithms , 57(3):570–611, 2020

  42. [42]

    Nourdin, G

    I. Nourdin, G. Peccati, and A. R´ eveillac. Multivariate normal ap proximation using Stein’s method and Malliavin calculus. In Annales de l’IHP Probabilit´ es et statistiques, 46(1):45–58, 2010

  43. [43]

    Narayanan and V

    A. Narayanan and V. Shmatikov. Robust de-anonymization of la rge sparse datasets. In 2008 IEEE Symposium on Security and Privacy (ISSP) , pages 111–125, 2008

  44. [44]

    Narayanan and V

    A. Narayanan and V. Shmatikov. De-anonymizing social networ ks. In 2009 30th IEEE Sym- posium on Security and Privacy (ISSP) , pages 173–187, 2009

  45. [45]

    Pedarsani and M

    P. Pedarsani and M. Grossglauser. On the privacy of anonymiz ed networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge D iscovery and Data Mining , pages 1235–1243, New York, NY, USA, 2011. Association for Comp uting Machinery

  46. [46]

    M. Z. Racz and A. Sridhar. Correlated randomly growing graphs . In Annals of Applied Probability, 32(2):1058–1111, 2022

  47. [47]

    M. Z. Racz and A. Sridhar. Correlated stochastic block models: Exact graph matching with applications to recovering communities. In Advances in Neural Information Processing Systems (NIPS), volume 34. Curran Associates, Inc., 2021

  48. [48]

    Shirani, S

    F. Shirani, S. Garg, and E. Erkip. Seeded graph matching: Efficie nt algorithms and theoretical guarantees. In 2017 51st Asilomar Conference on Signals, Systems, and Comp uters, pages 253–257, 2017. 30

  49. [49]

    Singh, J

    R. Singh, J. Xu, and B. Berger. Global alignment of multiple prote in interaction networks with application to functional orthology detection. In Proceedings of the National Academy of Sciences of the United States of America , 105:12763–12768, 10 2008

  50. [50]

    S. Umeyama. An eigen-decomposition approach to weighted gra ph matching problems. In IEEE Transactions on Pattern Analysis and Machine Intellig ence, 10(5):695–703, 1988

  51. [51]

    G. W. Stewart. The Efficient Generation of Random Orthogonal Matrices with an Application to Condition Estimators. In SIAM Journal on Numerical Analysis , 17(3):403–409, 1980

  52. [52]

    Vershynin

    R. Vershynin. High-dimensional probability: An introduction with applications in data science. Cambridge university press, 2018

  53. [53]

    J. T. Vogelstein, J. M. Conroy, V. Lyzinski, L. J. Podrazik, S. G . Kratzer, E. T. Harley, D. E. Fishkind, R. J. Vogelstein, and C. E. Priebe. Fast approximate quad ratic programming for graph matching. In PLOS ONE , 10(4):1–17, 2015

  54. [54]

    H. Wang, Y. Wu, J. Xu and I. Yolou. Random graph matching in geo metric models: the case of complete graphs. In Proceedings of 35th Conference on Learning Theory (COLT) , volume 178 of Proceedings of Machine Learning Research , pages 3441–3488, 2022

  55. [55]

    Y. Wu, J. Xu and S. H. Yu. Testing correlation of unlabeled rando m graphs. In Annals of Applied Probability, 33(4): 2519–2558, 2023

  56. [56]

    Y. Wu, J. Xu and S. H. Yu. Settling the sharp reconstruction th resholds of random graph matching. In IEEE Transactions on Information Theory , 68(8):5391–5417, 2022

  57. [57]

    Yartseva and M

    L. Yartseva and M. Grossglauser. On the performance of per colation graph matching. In Proceedings of the First ACM Conference on Online Social Net works (COSN) , pages 119–130, New York, NY, USA, 2013. Association for Computing Machinery

  58. [58]

    Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the Da vis-Kahan theorem for statisticians. In Biometrika, 102(2):315–323, 2015. 31