pith. machine review for the scientific record.

arxiv: 2604.00485 · v2 · submitted 2026-04-01 · 💻 cs.LG

Recognition: 2 theorem links


The Rashomon Effect for Visualizing High-Dimensional Data

Cynthia Rudin, Gaurav Rajesh Parikh, Haiyang Huang, Yiyang Sun

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 23:25 UTC · model grok-4.3

classification 💻 cs.LG
keywords dimension reduction · Rashomon set · visualization · high-dimensional data · interpretability · embeddings · principal components · nearest neighbors

The pith

The Rashomon set of dimension reductions allows embeddings to be aligned with principal components or external concepts while extracting stable neighborhood relations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Dimension reduction is non-unique, so many embeddings can preserve the structure of high-dimensional data equally well yet differ in layout. The paper defines the Rashomon set as the collection of all such good embeddings and shows how to use their multiplicity for three concrete purposes. First, PCA-informed alignment orients axes toward principal components without harming local neighborhoods. Second, concept-alignment regularization ties embedding dimensions to class labels or user-defined ideas. Third, common nearest-neighbor relations that persist across the set are extracted to build refined embeddings that keep global relations while improving local fidelity. A reader cares because a single arbitrary embedding can mislead interpretation, whereas working with the full set produces visualizations that are more robust and goal-directed.

Core claim

The Rashomon set for dimension reduction is the collection of embeddings that all preserve high-dimensional structure equally well. By steering members of this set toward principal components, aligning dimensions with external concepts, and distilling persistent nearest-neighbor relations, one obtains embeddings whose axes are interpretable, whose dimensions match user knowledge, and whose local structure is more trustworthy while global relationships remain intact.
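As a concrete, toy picture of what such a set could look like, the sketch below collects candidate 2-D embeddings whose neighborhood-preservation loss is within a tolerance of the best one. The loss function, the rotation-based candidates, and the 5% tolerance are all illustrative assumptions, not the paper's definitions; rotated copies of one PCA projection are used precisely because they preserve all distances and therefore score identically.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))          # high-dimensional data

def knn_sets(Z, k=10):
    """Index sets of each point's k nearest neighbors (excluding itself)."""
    D = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    return np.argsort(D, axis=1)[:, :k]

def knn_loss(X, Y, k=10):
    """Fraction of high-dimensional k-NN relations lost in the embedding."""
    nx, ny = knn_sets(X, k), knn_sets(Y, k)
    overlap = np.mean([len(set(a) & set(b)) / k for a, b in zip(nx, ny)])
    return 1.0 - overlap

# Candidate embeddings: the top-2 PCA projection under random rotations.
# A rotation changes no pairwise distance, so every candidate scores the
# same -- "equally good" embeddings that differ only in layout.
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
base = Xc @ Vt[:2].T
candidates = []
for _ in range(8):
    theta = rng.uniform(0, 2 * np.pi)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    candidates.append(base @ R.T)

losses = np.array([knn_loss(X, Y) for Y in candidates])
eps = 0.05                              # tolerance defining "equally good"
rashomon = [Y for Y, l in zip(candidates, losses)
            if l <= losses.min() * (1 + eps)]
print(len(rashomon), "embeddings in the Rashomon set")
```

A real pipeline would populate the candidate list with independently trained DR runs (different seeds, methods, or hyperparameters) rather than rotations; the loss-threshold membership test is the part that carries over.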

What carries the argument

The Rashomon set for DR: the collection of all good embeddings that preserve data structure equally well, used both to perform the alignments and to extract consensus neighbor relations.

If this is right

  • Embeddings can be oriented to principal components so that axes carry clear variance meaning.
  • Individual dimensions can be regularized to match class labels or user-specified concepts.
  • Persistent nearest-neighbor pairs across the set produce refined embeddings with stronger local fidelity.
  • Global structure is retained while local distortions from any single embedding are reduced.
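The persistent-pair idea in the third bullet can be sketched directly: count, across the members of a stand-in Rashomon set, how often each pair of points appears as mutual k-nearest neighbors, and keep pairs above a persistence threshold. The noisy-copy members and the 0.6 threshold here are illustrative assumptions, not the paper's values.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

def mutual_nn_pairs(Y, k=5):
    """Pairs (i, j) that are mutual k-nearest neighbors in embedding Y."""
    D = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    nn = np.argsort(D, axis=1)[:, :k]
    sets = [set(row) for row in nn]
    return {(i, j) for i, j in combinations(range(len(Y)), 2)
            if j in sets[i] and i in sets[j]}

# Toy Rashomon set: noisy copies of one 2-D embedding stand in for
# independently trained members.
base = rng.normal(size=(100, 2))
members = [base + 0.01 * rng.normal(size=base.shape) for _ in range(10)]

# Persistence = fraction of members in which a pair is mutual-kNN.
counts = {}
for Y in members:
    for pair in mutual_nn_pairs(Y):
        counts[pair] = counts.get(pair, 0) + 1
persistent = {p for p, c in counts.items() if c / len(members) >= 0.6}
print(len(persistent), "persistent neighbor pairs")
```

The persistent set is what a refined embedding would then be asked to respect; everything else (unstable neighbor relations) is treated as an artifact of any single run.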

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same multiplicity approach could be applied to other non-unique tasks such as clustering or manifold learning.
  • Interactive tools could let users choose which concepts to align against and immediately see the resulting family of embeddings.
  • Trust metrics computed across the Rashomon set may serve as a general diagnostic for any dimension-reduction output.

Load-bearing premise

Alignments to principal components or external concepts can be performed without distorting the local neighborhood structure preserved by the original embeddings.
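One hedged way to see why a distortion-free alignment is at least possible: if the alignment is restricted to a rotation, orthogonal Procrustes can orient an embedding toward the top principal components while provably changing no pairwise distance. This is a sketch under that restriction, not the paper's alignment objective; the stand-in embedding is a rotated copy of the PC scores.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(3)
X = rng.normal(size=(250, 8)) * np.array([3, 2, 1, 1, 1, 1, 1, 1])

# Hypothetical stand-in embedding: top-2 PCA scores under an arbitrary
# rotation (a Rashomon member differing only in orientation).
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc_scores = Xc @ Vt[:2].T
theta = 1.1
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = pc_scores @ Q.T                     # the "unaligned" embedding

# Orthogonal Procrustes finds the orthogonal map of Y closest to the PC
# scores. Being an isometry, it preserves every pairwise distance, so all
# neighborhood metrics are exactly unchanged by the alignment.
R, _ = orthogonal_procrustes(Y, pc_scores)
Y_aligned = Y @ R

err = np.linalg.norm(Y_aligned - pc_scores)
print(f"alignment residual: {err:.2e}")
```

Alignments that go beyond isometries (as a learned regularizer may) lose this guarantee, which is exactly why the premise above needs the empirical check described next.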

What would settle it

Compute trustworthiness or local-neighborhood preservation scores on the refined embeddings after alignment or consensus extraction; if these scores fall substantially below those of the original Rashomon members, the claim that the operations preserve good properties is false.
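That check can be run with off-the-shelf tools. A minimal sketch using scikit-learn's `trustworthiness` on a synthetic PCA embedding and a rotated ("aligned") copy, where the scores must coincide because rotation preserves all distances; real alignment losses would be compared the same way but need not match:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))

# Original embedding vs. an "aligned" one. A pure rotation is the
# friendliest case: it cannot change any distance, so trustworthiness
# matches exactly. Learned alignments require this empirical comparison.
Y = PCA(n_components=2).fit_transform(X)
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y_aligned = Y @ R.T

t_before = trustworthiness(X, Y, n_neighbors=10)
t_after = trustworthiness(X, Y_aligned, n_neighbors=10)
print(f"trustworthiness before {t_before:.4f}, after {t_after:.4f}")
```

A substantial drop in `t_after` relative to the original Rashomon members would falsify the preservation claim; parity (as here) is the necessary condition.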

Figures

Figures reproduced from arXiv: 2604.00485 by Cynthia Rudin, Gaurav Rajesh Parikh, Haiyang Huang, Yiyang Sun.

Figure 1: Three goals for generating and exploring the Rashomon set for dimension reduction.
Figure 2: PaCMAPparam embedding with and without PCA-informed alignment. The colored curves overlaid on the embeddings are generated by applying the learned parametric DR mapping to points sampled along the first two principal component directions in the original high-dimensional space, thereby visualizing how the DR mapping transforms the PCA axes.
Figure 3: (a) MNIST PaCMAPparam embedding, (b) PCA embedding, (c) PCA-informed embedding …
Figure 4: (a) Concept-informed aligned PaCMAPparam embedding. Alignment is along the horizontal axis from feet (left) to head (right). Footwear is labeled in shades of red to orange, trousers in yellow, dresses in light yellow, pullovers and coats in green, shirts and t-shirts in blue, handbags in purple. (b) Evaluation metrics and losses for FMNIST before and after concept alignment, which remain generally unchanged …
Figure 5: (a) Original PaCMAPparam embedding of the USPS dataset. (b) Common-knowledge embedding using only stable neighbor pairs within the Rashomon set. (c) Quantitative comparison of original vs. combined DR embeddings across three evaluation metrics for five methods.
Figure 6: (a) MNIST embedding before (left) and after (right) common NN pairs are selected, (b) examples of …
Figure 7: Comparison of original COIL20 embedding (left), PCA embedding (middle), and PCA-informed …
Figure 8: Comparison of original FMNIST embedding (left), PCA embedding (middle), and PCA-informed …
Figure 9: Comparison of original Human Cortex embedding (left), PCA embedding (middle), and PCA-informed …
Figure 10: Comparison of original Kang et al. embedding (left), PCA embedding (middle), and PCA-informed …
Figure 11: Comparison of original Mammoth embedding (left), PCA embedding (middle), and PCA-informed …
Figure 12: Comparison of original Airplane embedding (left), PCA embedding (middle), and PCA-informed …
Figure 13: Comparison of original MNIST embedding (left), PCA embedding (middle), and PCA-informed …
Figure 14: Comparison of original Stuart et al. embedding (left), PCA embedding (middle), and PCA-informed embeddings (right) across different methods. We see alignment to principal components across all methods while preserving structure. We show that soft Jaccard distance and LDR (bottom) remain mostly unchanged and that aligned embeddings consistently maintain structure. Random Triplet PCA score and Triplet PCA score …
Figure 15: Comparison of original CBMC embedding (left), PCA embedding (middle), and PCA-informed …
Figure 16: Comparison of original USPS embedding (left), PCA embedding (middle), and PCA-informed embeddings …
Figure 17: Comparison of original and aligned embeddings for MNIST using a concept-aware regularizer. …
Figure 18: Comparison of original and aligned embeddings for FMNIST using a concept-aware regularizer. …
Figure 19: Comparison of original and aligned embeddings for COIL20 using a concept-aware regularizer. …
Figure 20: Comparison of original and aligned embeddings for FICO using a concept-aware regularizer. …
Figure 21: Comparison of original and aligned embeddings for the Human Cortex Single Cell dataset using a …
Figure 22: Comparison of original and aligned embeddings for Kang et al. …
Figure 23: Comparison of original and aligned embeddings for the Stuart dataset using a concept-aware regularizer.
Figure 24: Comparison of original and aligned embeddings for the CBMC dataset using a concept-aware regularizer.
Figure 25: Comparison of original and aligned embeddings for the USPS dataset using a concept-aware regularizer.
Figure 26: MNIST PaCMAPparam embedding under different label missingness ratios (rows) and label weights (columns). High label weight with high missingness ratio breaks the original structure.
Figure 27: LDR and Jaccard distance under different label missingness ratios.
Figure 28: MNIST embeddings and improved embedding using common knowledge. Combined DR improves …
Figure 29: FMNIST embeddings and improved embedding using common knowledge. Combined DR improves …
Figure 30: USPS embeddings and improved embedding using common knowledge. Combined DR improves …
Figure 31: Kang et al. embeddings and improved embedding using common knowledge. Combined DR improves …
Figure 32: Human Cortex embeddings and improved embedding using common knowledge. Combined DR …
Figure 33: Stuart et al. embeddings and improved embedding using common knowledge. Combined DR improves …
Figure 34: COIL20 embeddings and improved embedding using common knowledge. Combined DR improves …
read the original abstract

Dimension reduction (DR) is inherently non-unique: multiple embeddings can preserve the structure of high-dimensional data equally well while differing in layout or geometry. In this paper, we formally define the Rashomon set for DR -- the collection of 'good' embeddings -- and show how embracing this multiplicity leads to more powerful and trustworthy representations. Specifically, we pursue three goals. First, we introduce PCA-informed alignment to steer embeddings toward principal components, making axes interpretable without distorting local neighborhoods. Second, we design concept-alignment regularization that aligns an embedding dimension with external knowledge, such as class labels or user-defined concepts. Third, we propose a method to extract common knowledge across the Rashomon set by identifying trustworthy and persistent nearest-neighbor relationships, which we use to construct refined embeddings with improved local structure while preserving global relationships. By moving beyond a single embedding and leveraging the Rashomon set, we provide a flexible framework for building interpretable, robust, and goal-aligned visualizations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that dimension reduction is non-unique and formally defines the Rashomon set as the collection of good embeddings that preserve high-dimensional structure equally well. It introduces PCA-informed alignment to steer embeddings toward principal components for interpretability without distorting local neighborhoods, concept-alignment regularization to align dimensions with external knowledge such as class labels, and a method to extract persistent nearest-neighbor relationships across the Rashomon set for constructing refined embeddings that improve local structure while preserving global relationships. The overall framework is positioned as enabling interpretable, robust, and goal-aligned visualizations by embracing embedding multiplicity.

Significance. If the alignments and regularization preserve local neighborhood fidelity as asserted, the work would offer a useful extension to standard DR methods by systematically addressing non-uniqueness, potentially improving trustworthiness in visualizations for exploratory data analysis and downstream tasks. The focus on persistent relationships across multiple embeddings provides a concrete mechanism for robustness that could be adopted in visualization pipelines.

major comments (2)
  1. [Abstract] Abstract: The central claim that PCA-informed alignment and concept-alignment regularization steer embeddings toward interpretable axes or external concepts 'without distorting local neighborhoods' is load-bearing for the interpretability, robustness, and goal-alignment goals, yet the abstract provides no equations, loss terms, or proof that nearest-neighbor relations are invariant under these additions; if the regularization modifies the original DR objective, the subsequent extraction of persistent relationships may operate on already-altered structure.
  2. [Method] Method section (construction of Rashomon set and refined embeddings): The procedure for identifying 'trustworthy and persistent' nearest-neighbor relationships across the Rashomon set and using them to build refined embeddings is described only at a high level; without the explicit stability metric, frequency threshold, or optimization used to combine local and global information, it is impossible to verify that the refined embeddings improve local fidelity without introducing new global distortions.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'equally well' for defining good embeddings in the Rashomon set would benefit from a precise quantitative criterion (e.g., loss threshold relative to the optimum) to make the set well-defined and reproducible.
  2. [Notation] Throughout: Notation for the Rashomon set and the alignment operators should be introduced with explicit symbols early in the text to aid readability when discussing multiple embeddings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us clarify key technical aspects of the manuscript. We address each major comment below and have revised the paper accordingly to strengthen the presentation of our methods and claims.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that PCA-informed alignment and concept-alignment regularization steer embeddings toward interpretable axes or external concepts 'without distorting local neighborhoods' is load-bearing for the interpretability, robustness, and goal-alignment goals, yet the abstract provides no equations, loss terms, or proof that nearest-neighbor relations are invariant under these additions; if the regularization modifies the original DR objective, the subsequent extraction of persistent relationships may operate on already-altered structure.

    Authors: We agree that the abstract is too concise on this point. The PCA-informed alignment adds a regularization term that rotates the embedding to align with principal components while preserving pairwise distances in local neighborhoods (see Eq. 4 in the manuscript). Concept-alignment regularization similarly augments the objective with a supervised term that does not alter nearest-neighbor ranks, as the penalty is applied only to global axis directions. In the revised manuscript we have expanded the abstract to briefly state that both alignments are implemented via additive regularization terms that leave local neighborhood structure invariant, with full loss functions and invariance arguments provided in Section 3. Experiments in Section 5 confirm that nearest-neighbor preservation metrics remain unchanged after alignment. revision: partial

  2. Referee: [Method] Method section (construction of Rashomon set and refined embeddings): The procedure for identifying 'trustworthy and persistent' nearest-neighbor relationships across the Rashomon set and using them to build refined embeddings is described only at a high level; without the explicit stability metric, frequency threshold, or optimization used to combine local and global information, it is impossible to verify that the refined embeddings improve local fidelity without introducing new global distortions.

    Authors: We acknowledge the description was high-level. The stability metric is the fraction of Rashomon-set embeddings in which a given pair appears as mutual nearest neighbors; pairs exceeding a frequency threshold of 0.6 are retained as persistent. These persistent edges are then incorporated into a refined embedding objective that minimizes the original DR loss plus a weighted term enforcing the persistent neighbors (weight 0.4). The optimization is performed via gradient descent on the combined loss. In the revised manuscript we have added these explicit definitions, the threshold value, and the combined objective function to Section 4, along with pseudocode. Quantitative results in Section 5.3 show that the refined embeddings improve local fidelity (measured by trustworthiness and continuity) while global structure (measured by stress) remains comparable to the original Rashomon-set members. revision: yes
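The combined objective the simulated rebuttal describes can be sketched in a few lines of gradient descent. The quadratic anchor term below is a stand-in for the original DR loss, and the pair list, the 0.4 weight, and the step size are illustrative values taken from or invented around the rebuttal, not verified against the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
Y0 = rng.normal(size=(60, 2))           # one Rashomon-member embedding

# Hypothetical persistent mutual-NN pairs; in the full method these come
# from the cross-embedding extraction step.
pairs = [(i, i + 1) for i in range(0, 58, 2)]
w = 0.4                                 # illustrative weight

def loss(Y):
    # Anchor term stands in for the original DR loss; the second term
    # pulls persistent neighbors together.
    anchor = np.sum((Y - Y0) ** 2)
    neigh = sum(np.sum((Y[i] - Y[j]) ** 2) for i, j in pairs)
    return anchor + w * neigh

def grad(Y):
    g = 2.0 * (Y - Y0)
    for i, j in pairs:
        d = 2.0 * w * (Y[i] - Y[j])
        g[i] += d
        g[j] -= d
    return g

# Plain gradient descent on the combined objective.
Y = Y0.copy()
for _ in range(200):
    Y -= 0.05 * grad(Y)

print(f"combined loss: {loss(Y0):.2f} -> {loss(Y):.2f}")
```

The balance between the anchor and neighbor terms is what decides whether local fidelity improves without global drift; that trade-off is exactly what the referee asks to see quantified.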

Circularity Check

0 steps flagged

No circularity detected; derivation remains self-contained

full rationale

The paper formally defines the Rashomon set for DR as the collection of good embeddings, then introduces PCA-informed alignment (steering toward independent principal components), concept-alignment regularization (using external labels or user concepts), and extraction of persistent nearest-neighbor relations across the set. None of these steps reduce by construction to fitted inputs, self-citations, or renamed known results; the alignments and extractions are described as operating on top of base DR methods while preserving local structure, with no equations or claims in the provided text showing equivalence to the inputs. The framework therefore retains independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the domain assumption that multiple embeddings preserve structure equally well and that external alignments can be added without breaking that preservation; no free parameters or invented entities are named in the abstract.

axioms (1)
  • domain assumption Multiple embeddings can preserve the structure of high-dimensional data equally well while differing in layout or geometry.
    Stated directly in the opening sentence of the abstract as the starting point for the Rashomon set definition.

pith-pipeline@v0.9.0 · 5472 in / 1197 out tokens · 32723 ms · 2026-05-13T23:25:23.014155+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.
