Recovering Wasserstein Distance Matrices from Few Measurements
Pith reviewed 2026-05-21 21:58 UTC · model grok-4.3
The pith
Nyström completion recovers Wasserstein distance matrices from O(d log d) columns while preserving MDS stability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Nyström completion, by sampling and using only O(d log d) columns of the Wasserstein distance matrix, produces an approximation that keeps multidimensional scaling stable; the resulting low-dimensional embeddings support reliable downstream classification on OrganCMNIST data even when just 10 percent of the columns are available.
What carries the argument
Nyström completion on Wasserstein distance matrices, which reconstructs the full matrix and its MDS embedding from a small set of sampled columns.
If this is right
- Manifold-learning pipelines that rely on Wasserstein distances can run with far fewer pairwise computations.
- Nyström sampling outperforms triangular matrix completion when the computational budget is fixed.
- Classification performance on medical-image benchmarks remains stable under this reduced sampling regime.
- Stability proofs for MDS extend directly to the Nyström-approximated Wasserstein matrices.
Where Pith is reading between the lines
- The same column-sampling idea could be tested on other optimal-transport distances beyond Wasserstein.
- If the low-rank structure is common, the method might scale Wasserstein-based learning to much larger data sets.
- Combining Nyström sampling with faster approximate Wasserstein solvers could reduce cost even further.
Load-bearing premise
The Wasserstein distance matrix admits a sufficiently accurate low-rank or column-space approximation that can be recovered from O(d log d) sampled columns.
What would settle it
An experiment on a dataset where reducing to 10 percent of the columns causes a large increase in MDS stress or a clear drop in classification accuracy.
Figures
read the original abstract
This paper proposes two algorithms for estimating square Wasserstein distance matrices from a small number of entries. These matrices are used to compute manifold learning embeddings like multidimensional scaling (MDS) or Isomap, but contrary to Euclidean distance matrices, are extremely costly to compute. We analyze matrix completion from upper triangular samples and Nystr\"{o}m completion in which $\mathcal{O}(d\log(d))$ columns of the distance matrices are computed where $d$ is the desired embedding dimension, prove stability of MDS under Nystr\"{o}m completion, and show that it can outperform matrix completion for a fixed budget of sample distances. Finally, we show that classification of the OrganCMNIST dataset from the MedMNIST benchmark is stable on data embedded from the Nystr\"{o}m estimation of the distance matrix even when only 10\% of the columns are computed.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes two algorithms for recovering square Wasserstein distance matrices from few entries: one based on matrix completion from upper-triangular samples and another using Nyström completion that computes only O(d log d) columns, where d is the target embedding dimension. It proves stability of multidimensional scaling (MDS) embeddings under the Nyström approach, shows that Nyström can outperform standard matrix completion for a fixed budget of distance computations, and reports that classification accuracy on the OrganCMNIST dataset remains stable when embeddings are obtained from a Nyström estimate using only 10% of the columns.
Significance. If the stability result and empirical claims hold, the work offers a practical route to scaling manifold learning methods that rely on Wasserstein distances, which are otherwise prohibitively expensive to compute in full. The combination of a stated proof for MDS stability together with reproducible downstream classification experiments on a public benchmark constitutes a concrete advance for optimal-transport-based dimensionality reduction.
major comments (2)
- [Section 4] The stability proof for MDS under Nyström completion (Section 4) and the claim that O(d log d) columns suffice both presuppose that the column space of the Wasserstein distance matrix is captured to sufficient accuracy by the sampled columns. No independent diagnostic—such as the decay rate of singular values of the full distance matrix or the Frobenius reconstruction error on held-out entries—is reported to verify this modeling assumption before the stability argument is invoked.
- [Experimental section / Table 2] Table 2 (or the corresponding experimental table) shows classification accuracy on OrganCMNIST remains within a few percent of the full-matrix baseline at 10% columns, yet this downstream metric does not directly test the MDS embedding distortion that the stability theorem bounds. A direct comparison of embedding stress or Procrustes error between the Nyström and full embeddings would be needed to confirm that the observed classifier stability is not merely an artifact of the particular linear head.
minor comments (2)
- [Section 3] Notation for the number of sampled columns is introduced as O(d log d) in the abstract but should be stated precisely (with the hidden constants) when the Nyström procedure is defined.
- [Section 4] The manuscript would benefit from an explicit statement of the precise assumptions on the Wasserstein kernel (e.g., boundedness or Lipschitz conditions) that are used to obtain the stability guarantee.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, indicating planned revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [Section 4] The stability proof for MDS under Nyström completion (Section 4) and the claim that O(d log d) columns suffice both presuppose that the column space of the Wasserstein distance matrix is captured to sufficient accuracy by the sampled columns. No independent diagnosticâsuch as the decay rate of singular values of the full distance matrix or the Frobenius reconstruction error on held-out entriesâis reported to verify this modeling assumption before the stability argument is invoked.
Authors: We thank the referee for highlighting this point. The stability result in Section 4 is stated under the modeling assumption that the selected columns provide a sufficiently accurate basis for the column space of the distance matrix. We agree that empirical verification of this assumption would strengthen the presentation. In the revised manuscript we will add (i) plots of the singular-value decay of the full Wasserstein distance matrices for all datasets used in the experiments and (ii) the Frobenius reconstruction error of the Nyström approximation evaluated on a held-out set of entries. These diagnostics will be placed in a new subsection of Section 4 immediately preceding the stability theorem. revision: yes
-
Referee: [Experimental section / Table 2] Table 2 (or the corresponding experimental table) shows classification accuracy on OrganCMNIST remains within a few percent of the full-matrix baseline at 10% columns, yet this downstream metric does not directly test the MDS embedding distortion that the stability theorem bounds. A direct comparison of embedding stress or Procrustes error between the Nyström and full embeddings would be needed to confirm that the observed classifier stability is not merely an artifact of the particular linear head.
Authors: We agree that a direct quantification of embedding distortion provides a more immediate test of the stability theorem than the downstream classification task alone. In the revised experimental section we will augment Table 2 (and the associated figures) with two additional metrics: (i) the classical MDS stress value computed on the full versus Nyström embeddings and (ii) the Procrustes distance between the two embeddings, both reported as functions of the column budget. These quantities will be computed on the same OrganCMNIST splits used for classification, allowing readers to relate embedding fidelity directly to the observed classifier stability. revision: yes
Circularity Check
No circularity: stability proof and empirical claims rest on independent mathematical analysis and separate dataset evaluation
full rationale
The paper's core claims rest on two algorithms for distance matrix recovery, a stated stability proof for MDS under Nyström sampling of O(d log d) columns, and downstream classification experiments on OrganCMNIST using only 10% columns. No load-bearing step reduces by construction to fitted parameters or self-citations; the low-rank column-space modeling assumption is used for the proof but is not derived from the target results themselves, and the empirical validation uses held-out classification performance rather than reconstruction error on the same samples. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Wasserstein distance matrices admit accurate approximation via Nyström sampling of O(d log d) columns
Reference graph
Works this paper leans on
-
[1]
Nanning Zheng and Jianru Xue,Statistical learning and pattern analysis for image and video processing, Springer Science & Business Media, 2009. 7
work page 2009
-
[2]
Manifold learning: What, how, and why,
Marina Meil˘ a and Hanyu Zhang, “Manifold learning: What, how, and why,”Annual Review of Statistics and Its Application, vol. 11, no. 1, pp. 393–417, 2024
work page 2024
-
[3]
Molecular phenotyping using networks, diffusion, and topol- ogy: soft-tissue sarcoma,
James Mathews, Maryam Pouryahya, Caroline Moosm¨ uller, Ioannis G. Kevrekidis, Joseph O. Deasy, and Allen Tannenbaum, “Molecular phenotyping using networks, diffusion, and topol- ogy: soft-tissue sarcoma,”Scientific Reports, vol. 9, 2019, Article number: 13982
work page 2019
-
[4]
From word embeddings to document distances,
Matt Kusner, Yu Sun, Nicholas Kolkin, and Kilian Weinberger, “From word embeddings to document distances,” inInternational conference on machine learning. PMLR, 2015, pp. 957–966
work page 2015
-
[5]
An optimal transportation approach for nuclear structure-based pathology,
Wei Wang, John A Ozolek, Dejan Slepˇ cev, Ann B Lee, Cheng Chen, and Gustavo K Rohde, “An optimal transportation approach for nuclear structure-based pathology,”IEEE transac- tions on medical imaging, vol. 30, no. 3, pp. 621–631, 2010
work page 2010
-
[6]
Computational optimal transport: with applications to data science,
Gabriel Peyr´ e and Marco Cuturi, “Computational optimal transport: with applications to data science,”Foundations and Trends in Machine Learning, vol. 11, no. 5-6, pp. 355–607, 2019
work page 2019
-
[7]
Wassmap: Wasserstein isometric mapping for image manifold learning,
Keaton Hamm, Nick Henscheid, and Shujie Kang, “Wassmap: Wasserstein isometric mapping for image manifold learning,”SIAM Journal on Mathematics of Data Science, vol. 5, no. 2, pp. 475–501, 2023
work page 2023
-
[8]
A survey of matrix completion methods for recommendation systems,
Andy Ramlatchan, Mengyun Yang, Quan Liu, Min Li, Jianxin Wang, and Yaohang Li, “A survey of matrix completion methods for recommendation systems,”Big Data Mining and Analytics, vol. 1, no. 4, pp. 308–323, 2018
work page 2018
-
[9]
Rank minimiza- tion via online learning,
Raghu Meka, Prateek Jain, Constantine Caramanis, and Inderjit S Dhillon, “Rank minimiza- tion via online learning,” inProceedings of the 25th International Conference on Machine learning, 2008, pp. 656–663
work page 2008
-
[10]
Emmanuel J Cand` es, Justin Romberg, and Terence Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,”IEEE Transactions on information theory, vol. 52, no. 2, pp. 489–509, 2006
work page 2006
-
[11]
Nonconvex optimization meets low-rank matrix factorization: An overview,
Yuejie Chi, Yue M Lu, and Yuxin Chen, “Nonconvex optimization meets low-rank matrix factorization: An overview,”IEEE Transactions on Signal Processing, vol. 67, no. 20, pp. 5239–5269, 2019
work page 2019
-
[12]
Euclidean distance geometry and applications,
Leo Liberti, Carlile Lavor, Nelson Maculan, and Antonio Mucherino, “Euclidean distance geometry and applications,”SIAM review, vol. 56, no. 1, pp. 3–69, 2014
work page 2014
-
[13]
Nilson JM Moreira, Leonardo T Duarte, Carlile Lavor, and Cristiano Torezzan, “A novel low- rank matrix completion approach to estimate missing entries in euclidean distance matrix,” Computational and Applied Mathematics, vol. 37, no. 4, pp. 4989–4999, 2018
work page 2018
-
[14]
Exact reconstruction of euclidean distance geometry problem using low-rank matrix completion,
Abiy Tasissa and Rongjie Lai, “Exact reconstruction of euclidean distance geometry problem using low-rank matrix completion,”IEEE Transactions on Information Theory, vol. 65, no. 5, pp. 3124–3144, 2018
work page 2018
-
[15]
Using the nystr¨ om method to speed up kernel machines,
Christopher Williams and Matthias Seeger, “Using the nystr¨ om method to speed up kernel machines,”Advances in neural information processing systems, vol. 13, 2000. 8
work page 2000
-
[16]
Supervised learning of sheared distributions using linearized optimal transport,
Varun Khurana, Harish Kannan, Alexander Cloninger, and Caroline Moosm¨ uller, “Supervised learning of sheared distributions using linearized optimal transport,”Sampling Theory, Signal Processing, and Data Analysis, vol. 21, no. 1, 2023
work page 2023
-
[17]
Linearized wasserstein dimensionality reduction with approximation guarantees,
Alexander Cloninger, Keaton Hamm, Varun Khurana, and Caroline Moosm¨ uller, “Linearized wasserstein dimensionality reduction with approximation guarantees,”Applied and Computa- tional Harmonic Analysis, vol. 74, pp. 101718, 2025
work page 2025
-
[18]
Matrix Coherence and the Nystrom Method
Ameet Talwalkar and Afshin Rostamizadeh, “Matrix coherence and the nystrom method,” arXiv preprint arXiv:1408.2044, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2044
-
[19]
Perturbations of cur decompositions,
Keaton Hamm and Longxiu Huang, “Perturbations of cur decompositions,”SIAM Journal on Matrix Analysis and Applications, vol. 42, no. 1, pp. 351–375, 2021
work page 2021
-
[20]
Improved analysis of the subsampled randomized hadamard transform,
Joel A Tropp, “Improved analysis of the subsampled randomized hadamard transform,”Ad- vances in Adaptive Data Analysis, vol. 3, no. 01n02, pp. 115–126, 2011
work page 2011
-
[21]
Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,
Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, and Bingbing Ni, “Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,”Scientific Data, vol. 10, no. 1, pp. 41, 2023
work page 2023
-
[22]
Matrix completion with cross- concentrated sampling: Bridging uniform sampling and cur sampling,
HanQin Cai, Longxiu Huang, Pengyu Li, and Deanna Needell, “Matrix completion with cross- concentrated sampling: Bridging uniform sampling and cur sampling,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10100–10113, 2023. 9
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.