pith. machine review for the scientific record. sign in

arxiv: 2605.03346 · v1 · submitted 2026-05-05 · 💻 cs.DS · cs.LG

Recognition: unknown

Provable Accuracy Collapse in Embedding-Based Representations under Dimensionality Mismatch

Dionysis Arvanitakis, Vaggos Chatziafratis, Yiyuan Luo

Pith reviewed 2026-05-07 14:22 UTC · model grok-4.3

classification 💻 cs.DS cs.LG
keywords dimensionality mismatchcontrastive learningtriplet constraintsembedding accuracyinformation-theoretic boundsUnique Games Conjecture
0
0 comments X

The pith

Embeddings whose dimension is a constant fraction below the ground-truth dimension must violate at least half of all given triplet constraints, no matter how they are chosen.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a sharp information-theoretic barrier: when distance-comparison triplets are exactly realizable by some unknown embedding in D Euclidean dimensions, any representation using dimension at most cD for a fixed c less than 1 is forced to violate at least 50 percent of those triplets. This bound matches the accuracy of the trivial one-dimensional solution that simply ignores the data. The result applies directly to standard contrastive-learning settings that rely on anchor-positive-negative triplets. A complementary hardness theorem shows that, even when the triplets are almost realizable in one dimension, no polynomial-time algorithm can exceed the 50 percent baseline under the Unique Games Conjecture.

Core claim

Given a set of triplets realizable by an unknown ground-truth embedding in D dimensions, every embedding of dimension at most cD for some constant c less than 1 violates at least half of the triplets. This forces accuracy no higher than that of a trivial one-dimensional embedding that disregards the input entirely. The same collapse occurs even under computational restrictions: under the Unique Games Conjecture, no polynomial-time algorithm, irrespective of its output dimension, can achieve accuracy above 50 percent when the triplets are only nearly realizable in one dimension.

What carries the argument

The information-theoretic collapse bound that shows low-dimensional embeddings cannot preserve more than half the triplet distance comparisons supplied by a higher-dimensional realization.

If this is right

  • Any practical embedding algorithm must use dimension linearly close to the unknown ground-truth dimension or else accept at least 50 percent triplet violations.
  • The collapse holds even when the algorithm is allowed arbitrary computational resources and can choose its own dimension freely.
  • The same 50 percent barrier applies to nearly realizable one-dimensional data under standard complexity assumptions.
  • Dimensionality reduction below the linear threshold cannot be rescued by better optimization or larger models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners who compress embeddings for speed or memory should expect a hard accuracy cliff rather than a gradual degradation once dimension falls below the linear threshold.
  • The result suggests that dimensionality-estimation techniques become necessary before applying contrastive losses, because guessing too low triggers the collapse.
  • Extensions to approximate or noisy triplets would require new proof techniques, but the exact case already rules out many current low-dimensional pipelines.

Load-bearing premise

The triplets are produced exactly by some unknown Euclidean embedding of dimension precisely D.

What would settle it

Construct a large set of triplets exactly realizable in D dimensions, then exhibit a concrete embedding of dimension cD that satisfies strictly more than half of them.

Figures

Figures reproduced from arXiv: 2605.03346 by Dionysis Arvanitakis, Vaggos Chatziafratis, Yiyuan Luo.

Figure 1
Figure 1. Figure 1: Ground-truth Euclidean embeddings: triplet accuracy under various ground-truth dimensions D and embedding dimen￾sions d, with n = 1000 points and m = 106 triplets. Top: unconstrained embeddings. Bottom: spherical embeddings. 512 256 128 64 32 16 8 4 2 Embed Dim 0.5 0.6 0.7 0.8 0.9 1.0 Accuracy unconstrained spherical view at source ↗
Figure 2
Figure 2. Figure 2: Uniformly random triplets: triplet accuracy under various embedding dimensions d. We fix the number of points n = 4000 and the number of triplets m = 106 . for accuracy: even when triplet constraints are perfectly realizable in high dimension, reducing the embedding di￾mension below a constant fraction of the ground-truth di￾mension can force accuracy to collapse down to the trivial 1 2 -baseline. We furth… view at source ↗
read the original abstract

Embedding-based representations in Euclidean space $\mathbb{R}^d$ are a cornerstone of modern machine learning, where a major goal is to use the \emph{smallest dimension} that faithfully captures data relations. In this work, we prove sharp dimension--accuracy tradeoffs and identify a fundamental information-theoretic limitation: unless the embedding dimension $d$ is chosen close to the ground-truth dimension $D$, accuracy undergoes a sudden collapse. Our main result shows that this phenomenon arises even in standard contrastive learning settings, where supervision is limited to a set of $m$ anchor--positive--negative triplets $(i,j,k)$ encoding distance comparisons $\mathrm{dist}(i,j) < \mathrm{dist}(i,k)$. Specifically, given triplets realizable by an unknown ground-truth embedding in $D$ dimensions, we prove that there exists constant $c < 1$, such that \emph{every embedding of dimension at most $cD$ violates half of the triplets}, yielding accuracy as low as a trivial one-dimensional solution that ignores the input. We complement our information-theoretic bounds with strong computational hardness results: under the Unique Games Conjecture, even if the given triplets are nearly realizable in $D=1$ dimension, no polynomial-time algorithm -- \textit{regardless of its dimension} -- can achieve accuracy above the trivial $50\%$ baseline.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proves information-theoretic lower bounds for contrastive embedding learning: given triplets exactly realizable by a ground-truth Euclidean embedding in D dimensions, there exists a fixed constant c < 1 such that every embedding of dimension at most cD must violate at least half the triplets (accuracy no better than the trivial 50% baseline). It complements this with UGC-based computational hardness showing that, even for triplets nearly realizable in D=1, no polynomial-time algorithm (regardless of embedding dimension) can exceed 50% accuracy.

Significance. If the central claims hold, the work identifies a sharp, dimension-dependent phase transition in achievable accuracy for triplet-based embeddings, with potential implications for contrastive learning and dimensionality selection in ML. The combination of counting arguments and UGC reductions provides a clean theoretical separation between information-theoretic and computational barriers.

major comments (2)
  1. [Main theorem and information-theoretic section] The central collapse bound is conditioned on exact realizability of all triplets by some configuration in precisely D Euclidean dimensions. The manuscript should explicitly address the robustness of the result when this assumption is relaxed to approximate satisfaction or when the minimal embedding dimension is strictly less than D, as these are the regimes encountered in practice and could invalidate the drop to the 50% baseline (see the statement of the main theorem and the discussion of the information-theoretic argument).
  2. [Proof of the information-theoretic bound] The derivation of the constant c and the specific counting or rigidity argument establishing the half-violation threshold are not fully detailed in the provided text. A proof sketch or key lemma (e.g., the volume or linear-algebraic step that forces the accuracy drop) is needed to verify that the bound is independent of the particular ground-truth configuration.
minor comments (2)
  1. [Abstract and hardness section] The abstract and introduction should clarify whether the UGC hardness applies to the exact or approximate version of the triplet realization problem and state the precise approximation factor used in the reduction.
  2. [Preliminaries] Notation for the triplet set size m and the ground-truth dimension D is introduced without an explicit comparison to the number of degrees of freedom in an embedding of dimension d; adding a short remark on this would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. The comments highlight important aspects of the assumptions and presentation of our information-theoretic results. We address each major comment below and have prepared revisions to the manuscript accordingly.

read point-by-point responses
  1. Referee: [Main theorem and information-theoretic section] The central collapse bound is conditioned on exact realizability of all triplets by some configuration in precisely D Euclidean dimensions. The manuscript should explicitly address the robustness of the result when this assumption is relaxed to approximate satisfaction or when the minimal embedding dimension is strictly less than D, as these are the regimes encountered in practice and could invalidate the drop to the 50% baseline (see the statement of the main theorem and the discussion of the information-theoretic argument).

    Authors: We agree that the main theorem is stated under the assumption of exact realizability by a configuration that requires precisely D dimensions. This is the setting in which the sharp collapse to 50% accuracy can be proven via a counting argument on realizable sign patterns. In the revised manuscript, we will add a new subsection in the discussion (Section 5) explicitly addressing robustness. We clarify that if the ground-truth configuration has intrinsic dimension d' < D, then the collapse threshold shifts to c d' rather than c D, so embeddings of dimension cD may still succeed when cD > d'. For approximate satisfaction of the triplets, we note that the exact-case bound does not directly apply and that stability of Euclidean embeddings would be needed to extend the result; we leave a quantitative version for approximate realizability as future work rather than claiming it holds unchanged. This addition makes the scope of the theorem precise without overstating its applicability to noisy or lower-dimensional practical regimes. revision: partial

  2. Referee: [Proof of the information-theoretic bound] The derivation of the constant c and the specific counting or rigidity argument establishing the half-violation threshold are not fully detailed in the provided text. A proof sketch or key lemma (e.g., the volume or linear-algebraic step that forces the accuracy drop) is needed to verify that the bound is independent of the particular ground-truth configuration.

    Authors: We appreciate this observation. The full proof appears in the appendix, but the main-text exposition in Section 3 is indeed concise. In the revision we will insert a self-contained proof sketch immediately after the theorem statement. The argument proceeds by considering the arrangement of hyperplanes defined by the distance comparisons; each triplet corresponds to a sign pattern on the squared-distance differences. A D-dimensional configuration can realize at most 2^{O(D)} distinct sign patterns up to isometry, while there exist exponentially more (in the number of points) possible triplet sets. Any embedding in dimension at most cD with c < 1 can realize only a vanishing fraction of these patterns, forcing at least half the triplets to be violated for some configurations. The constant c arises from the point where the dimension-dependent growth rate of realizable orthants falls below the threshold needed to cover more than 50% of a worst-case triplet set. The argument relies only on the fact that the ground-truth points affinely span R^D (ensuring full-dimensional rigidity) and is therefore independent of the specific point locations. We will also add a short lemma formalizing the sign-pattern counting step. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation relies on external information-theoretic and UGC arguments

full rationale

The central claim is a theorem: if triplets are exactly realizable by a ground-truth embedding in precisely D Euclidean dimensions, then any embedding in dimension at most cD (c<1) must violate at least half the triplets. This is established via information-theoretic counting arguments and a hardness reduction under the Unique Games Conjecture (an external conjecture, not a self-citation). The exact-D realizability premise is stated explicitly as an assumption and is not derived from or equivalent to the collapse bound itself. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided derivation chain. The result is therefore self-contained against external benchmarks and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard properties of Euclidean distance and metric embeddings plus the Unique Games Conjecture; no free parameters, ad-hoc constants, or new entities are introduced.

axioms (2)
  • domain assumption Triplets encode strict distance comparisons realizable in Euclidean space of dimension D
    The main theorem assumes the input triplets come from some exact D-dimensional Euclidean embedding.
  • standard math Unique Games Conjecture holds
    Used for the computational hardness result that no poly-time algorithm exceeds 50% accuracy in any dimension.

pith-pipeline@v0.9.0 · 5553 in / 1381 out tokens · 99496 ms · 2026-05-07T14:22:13.266646+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

258 extracted references · 15 canonical work pages · 2 internal anchors

  1. [1]

    2001 , publisher=

    Random Graphs , author=. 2001 , publisher=

  2. [2]

    Proceedings of the forty-eighth annual ACM symposium on Theory of Computing , pages=

    A cost function for similarity-based hierarchical clustering , author=. Proceedings of the forty-eighth annual ACM symposium on Theory of Computing , pages=

  3. [3]

    Advances in Neural Information Processing Systems , volume=

    Subsampled power iteration: a unified algorithm for block models and planted csp's , author=. Advances in Neural Information Processing Systems , volume=

  4. [4]

    International Conference on Learning Representations , year=

    Decoupled Weight Decay Regularization , author=. International Conference on Learning Representations , year=

  5. [5]

    Transactions of the American Mathematical Society , volume=

    Lower bounds for approximation by nonlinear manifolds , author=. Transactions of the American Mathematical Society , volume=. 1968 , publisher=

  6. [6]

    Grothendieck, Alexandre , volume=. R. 1956 , publisher=

  7. [7]

    Constantes de Grothendieck et fonctions de type positif sur les spheres , author=. S

  8. [8]

    SIAM Journal on Computing , volume=

    Beating the random ordering is hard: Every ordering CSP is approximation resistant , author=. SIAM Journal on Computing , volume=. 2011 , publisher=

  9. [9]

    Zootaxa , volume=

    The tree of life web project , author=. Zootaxa , volume=. 2007 , publisher=

  10. [10]

    Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing , pages=

    Strongly refuting random csps below the spectral threshold , author=. Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing , pages=

  11. [11]

    Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms , pages=

    The threshold for SDP-refutation of random regular NAE-3SAT , author=. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms , pages=. 2019 , organization=

  12. [12]

    Annals of Applied Probability , pages=

    Broadcasting on trees and the Ising model , author=. Annals of Applied Probability , pages=. 2000 , publisher=

  13. [13]

    Annals of Applied Probability , pages=

    Reconstruction on trees: beating the second eigenvalue , author=. Annals of Applied Probability , pages=. 2001 , publisher=

  14. [14]

    Journal of Bioinformatics and Computational Biology , volume=

    Inferring phylogenetic relationships avoiding forbidden rooted triplets , author=. Journal of Bioinformatics and Computational Biology , volume=. 2006 , publisher=

  15. [15]

    Probability Theory and Related Fields , volume=

    Evolutionary trees and the Ising model on the Bethe lattice: a proof of Steel’s conjecture , author=. Probability Theory and Related Fields , volume=. 2011 , publisher=

  16. [16]

    The Thirty Sixth Annual Conference on Learning Theory , pages=

    Learning and Testing Latent-Tree Ising Models Efficiently , author=. The Thirty Sixth Annual Conference on Learning Theory , pages=. 2023 , organization=

  17. [17]

    Proceedings of the forty-sixth annual ACM symposium on Theory of computing , pages=

    Satisfiability threshold for random regular NAE-SAT , author=. Proceedings of the forty-sixth annual ACM symposium on Theory of computing , pages=

  18. [18]

    Proceedings of the forty-seventh annual ACM symposium on Theory of computing , pages=

    Proof of the satisfiability conjecture for large k , author=. Proceedings of the forty-seventh annual ACM symposium on Theory of computing , pages=

  19. [19]

    2015 IEEE 56th Annual Symposium on Foundations of Computer Science , pages=

    How to refute a random CSP , author=. 2015 IEEE 56th Annual Symposium on Foundations of Computer Science , pages=. 2015 , organization=

  20. [20]

    Journal of Combinatorial Optimization , volume=

    Constructing the maximum consensus tree from rooted triples , author=. Journal of Combinatorial Optimization , volume=. 2004 , publisher=

  21. [21]

    PLoS One , volume=

    Oldest known pantherine skull and evolution of the tiger , author=. PLoS One , volume=. 2011 , publisher=

  22. [22]

    1979 , publisher=

    On the complexity of the satisfiability problem , author=. 1979 , publisher=

  23. [23]

    Discrete Applied Mathematics , volume=

    Probabilistic analysis of the Davis Putnam procedure for solving the satisfiability problem , author=. Discrete Applied Mathematics , volume=. 1983 , publisher=

  24. [24]

    Journal of the ACM (JACM) , volume=

    Many hard examples for resolution , author=. Journal of the ACM (JACM) , volume=. 1988 , publisher=

  25. [25]

    Aaai , volume=

    Hard and easy distributions of SAT problems , author=. Aaai , volume=

  26. [26]

    Advances in applied mathematics , volume=

    Extension operations on sets of leaf-labeled trees , author=. Advances in applied mathematics , volume=. 1995 , publisher=

  27. [27]

    Mathematical biosciences , volume=

    Closure operations in phylogenetics , author=. Mathematical biosciences , volume=. 2007 , publisher=

  28. [28]

    Transactions of the American Mathematical Society , volume=

    Phase transitions in phylogeny , author=. Transactions of the American Mathematical Society , volume=

  29. [29]

    Proceedings of the thirty-eighth annual ACM symposium on Theory of computing , pages=

    Optimal phylogenetic reconstruction , author=. Proceedings of the thirty-eighth annual ACM symposium on Theory of computing , pages=

  30. [30]

    British Journal of Mathematical and Statistical Psychology , volume=

    Tree structures for proximity data , author=. British Journal of Mathematical and Statistical Psychology , volume=. 1981 , publisher=

  31. [31]

    Science , volume=

    Critical behavior in the satisfiability of random boolean expressions , author=. Science , volume=. 1994 , publisher=

  32. [32]

    Artificial intelligence , volume=

    Generating hard satisfiability problems , author=. Artificial intelligence , volume=. 1996 , publisher=

  33. [33]

    Proceedings., 33rd Annual Symposium on Foundations of Computer Science , pages=

    Mick gets some (the odds are on his side)(satisfiability) , author=. Proceedings., 33rd Annual Symposium on Foundations of Computer Science , pages=. 1992 , organization=

  34. [34]

    Journal of the ACM (JACM) , volume=

    A computing procedure for quantification theory , author=. Journal of the ACM (JACM) , volume=. 1960 , publisher=

  35. [35]

    Communications of the ACM , volume=

    A machine program for theorem-proving , author=. Communications of the ACM , volume=. 1962 , publisher=

  36. [36]

    Molecular Phylogenetics and Evolution , volume=

    Supermatrix and species tree methods resolve phylogenetic relationships within the big cats, Panthera (Carnivora: Felidae) , author=. Molecular Phylogenetics and Evolution , volume=. 2010 , publisher=

  37. [37]

    SIAM Journal on Computing , volume=

    Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions , author=. SIAM Journal on Computing , volume=. 1981 , publisher=

  38. [38]

    Proceedings of the thiry-fourth annual ACM symposium on Theory of computing , pages=

    On the power of unique 2-prover 1-round games , author=. Proceedings of the thiry-fourth annual ACM symposium on Theory of computing , pages=

  39. [39]

    Discrete Applied Mathematics , volume=

    The approximability of maximum rooted triplets consistency with fan triplets and forbidden triplets , author=. Discrete Applied Mathematics , volume=. 2019 , publisher=

  40. [40]

    Discrete Applied Mathematics , volume=

    New results on optimizing rooted triplets consistency , author=. Discrete Applied Mathematics , volume=. 2010 , publisher=

  41. [41]

    Proceedings of the twenty-seventh annual ACM symposium on Theory of computing , pages=

    Polynomial time approximation schemes for dense instances of NP-hard problems , author=. Proceedings of the twenty-seventh annual ACM symposium on Theory of computing , pages=

  42. [42]

    Mathematical programming , volume=

    A new rounding procedure for the assignment problem with applications to dense graph arrangement problems , author=. Mathematical programming , volume=. 2002 , publisher=

  43. [43]

    SIAM Journal on Computing , volume=

    A polynomial time approximation scheme for inferring evolutionary trees from quartet topologies and its application , author=. SIAM Journal on Computing , volume=. 2001 , publisher=

  44. [44]

    SIAM Journal on Discrete Mathematics , volume=

    On the compatibility of quartet trees , author=. SIAM Journal on Discrete Mathematics , volume=. 2014 , publisher=

  45. [45]

    SIAM Journal on Discrete Mathematics , volume=

    On the maximum quartet distance between phylogenetic trees , author=. SIAM Journal on Discrete Mathematics , volume=. 2016 , publisher=

  46. [46]

    Journal of computational Biology , volume=

    Constructing phylogenies from quartets: elucidation of eutherian superordinal relationships , author=. Journal of computational Biology , volume=

  47. [47]

    IEEE/ACM Transactions on Computational Biology and Bioinformatics , volume=

    Quartets MaxCut: a divide and conquer quartets algorithm , author=. IEEE/ACM Transactions on Computational Biology and Bioinformatics , volume=. 2008 , publisher=

  48. [48]

    Proceedings of the forty-sixth annual ACM symposium on Theory of computing , pages=

    A characterization of strong approximation resistance , author=. Proceedings of the forty-sixth annual ACM symposium on Theory of computing , pages=

  49. [49]

    Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat

    Orchestrating quartets: approximation and data correction , author=. Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No. 98CB36280) , pages=. 1998 , organization=

  50. [50]

    SIAM Journal on Computing , volume=

    Reconstructing approximate phylogenetic trees from quartet samples , author=. SIAM Journal on Computing , volume=. 2012 , publisher=

  51. [51]

    Journal of Computational Biology , volume=

    Short quartet puzzling: A new quartet-based phylogeny reconstruction algorithm , author=. Journal of Computational Biology , volume=. 2008 , publisher=

  52. [52]

    Molecular phylogenetics and evolution , volume=

    Quartet MaxCut: a fast algorithm for amalgamating quartet trees , author=. Molecular phylogenetics and evolution , volume=. 2012 , publisher=

  53. [53]

    Systematic biology , volume=

    Weighted quartets phylogenetics , author=. Systematic biology , volume=. 2015 , publisher=

  54. [54]

    2004 , publisher=

    Inferring phylogenies , author=. 2004 , publisher=

  55. [55]

    Systematic Biology , volume=

    Consensus techniques and the comparison of taxonomic trees , author=. Systematic Biology , volume=. 1972 , publisher=

  56. [56]

    In KDD workshop on text mining, volume 400

    A comparison of document clustering techniques , author=. In KDD workshop on text mining, volume 400. Boston , year=

  57. [57]

    Journal of the ACM (JACM) , volume=

    Some optimal inapproximability results , author=. Journal of the ACM (JACM) , volume=. 2001 , publisher=

  58. [58]

    Bioconsensus: DIMACS Working Group Meetings on Bioconsensus: October 25-26, 2000 and October 2-5, 2001, DIMACS Center , volume=

    A Classification of Consensus Methods for Phylogenetics , author=. Bioconsensus: DIMACS Working Group Meetings on Bioconsensus: October 25-26, 2000 and October 2-5, 2001, DIMACS Center , volume=. 2003 , organization=

  59. [59]

    Trends in Ecology & Evolution , volume=

    Phylogenetic supertrees: assembling the trees of life , author=. Trends in Ecology & Evolution , volume=. 1998 , publisher=

  60. [60]

    Systematic Biology , volume=

    Simple but fundamental limitations on supertree and consensus tree methods , author=. Systematic Biology , volume=. 2000 , publisher=

  61. [61]

    Algorithmica , volume=

    Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology , author=. Algorithmica , volume=. 1999 , publisher=

  62. [62]

    Information Processing Letters , volume=

    On the agreement of many trees , author=. Information Processing Letters , volume=. 1995 , publisher=

  63. [63]

    SIAM Journal on Computing , volume=

    Computing the local consensus of trees , author=. SIAM Journal on Computing , volume=. 1998 , publisher=

  64. [64]

    Discrete applied mathematics , volume=

    Reconstruction of rooted trees from subtrees , author=. Discrete applied mathematics , volume=. 1996 , publisher=

  65. [65]

    Discrete Applied Mathematics , volume=

    A supertree method for rooted trees , author=. Discrete Applied Mathematics , volume=. 2000 , publisher=

  66. [66]

    Journal of the ACM (JACM) , volume=

    Improved algorithms for constructing consensus trees , author=. Journal of the ACM (JACM) , volume=. 2016 , publisher=

  67. [67]

    Journal of the American statistical association , volume=

    Hierarchical grouping to optimize an objective function , author=. Journal of the American statistical association , volume=. 1963 , publisher=

  68. [68]

    Proceedings of the National Academy of Sciences , volume=

    Cluster analysis and display of genome-wide expression patterns , author=. Proceedings of the National Academy of Sciences , volume=. 1998 , publisher=

  69. [69]

    Grouping multidimensional data: Recent advances in clustering , pages=

    A survey of clustering data mining techniques , author=. Grouping multidimensional data: Recent advances in clustering , pages=. 2006 , publisher=

  70. [70]

    2020 , publisher=

    Mining of massive data sets , author=. 2020 , publisher=

  71. [71]

    Nickel, Maximillian and Kiela, Douwe , journal=. Poincar

  72. [72]

    Nature , volume=

    Hierarchical structure and the prediction of missing links in networks , author=. Nature , volume=. 2008 , publisher=

  73. [73]

    Physical review E , volume=

    Hierarchical organization in complex networks , author=. Physical review E , volume=. 2003 , publisher=

  74. [74]

    2009 , publisher=

    The elements of statistical learning: data mining, inference, and prediction , author=. 2009 , publisher=

  75. [75]

    Journal of the Royal Statistical Society: Series A (General) , volume=

    A review of hierarchical classification , author=. Journal of the Royal Statistical Society: Series A (General) , volume=. 1987 , publisher=

  76. [76]

    European Journal of Combinatorics , volume=

    Low dimensional embeddings of ultrametrics , author=. European Journal of Combinatorics , volume=. 2004 , publisher=

  77. [77]

    Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms , pages=

    Efficient algorithms for computing the triplet and quartet distance between trees of arbitrary degree , author=. Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms , pages=. 2013 , organization=

  78. [78]

    Approximation, Randomization and Combinatorial Optimization

    Beating a random assignment , author=. Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques , pages=. 2005 , publisher=

  79. [79]

    Approximation, Randomization, and Combinatorial Optimization

    On the approximation resistance of a random predicate , author=. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques , pages=. 2007 , publisher=

  80. [80]

    Proceedings of the forty-first annual ACM symposium on Theory of computing , pages=

    Randomly supported independence and resistance , author=. Proceedings of the forty-first annual ACM symposium on Theory of computing , pages=

Showing first 80 references.