pith. machine review for the scientific record.

arxiv: 2605.00716 · v1 · submitted 2026-05-01 · 💻 cs.LG · cs.SI

Recognition: unknown

Aitchison Embeddings for Learning Compositional Graph Representations

Chrysoula Kosma, Giannis Nikolentzos, Michail Chatzianastasis, Nikolaos Nakis, Panagiotis Promponas

Pith reviewed 2026-05-09 19:24 UTC · model grok-4.3

classification 💻 cs.LG · cs.SI
keywords graph embeddings · Aitchison geometry · compositional data analysis · isometric log-ratio transformation · interpretable representations · node classification · link prediction · archetypal analysis

The pith

Graph nodes represented as simplex compositions yield intrinsically interpretable embeddings that reflect archetype trade-offs and remain coherent under component restriction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a compositional framework for graph representation learning grounded in Aitchison geometry. Nodes are modeled as mixtures of latent archetypal factors on the simplex, then mapped to Euclidean space via isometric log-ratio (ILR) coordinates. The transformation preserves Aitchison distances between compositions while allowing standard unconstrained optimization. As a result, the embeddings are interpretable by construction, showing how nodes balance different roles. The framework also supports subcompositional analysis, which examines how specific groups of archetypes influence predictions on tasks such as node classification and link prediction.

Core claim

The central contribution is a new embedding method in which each node is a composition on the simplex representing its proportional affiliation with archetypal roles. These compositions are isometrically embedded into Euclidean space via fixed or learnable ILR bases, ensuring that Aitchison distances, which capture relative differences in mixture proportions, are exactly preserved as Euclidean distances. This setup supports unconstrained optimization for tasks such as link prediction and node classification, while the geometry inherently encodes relative trade-offs and retains subcompositional coherence when the set of considered archetypes is restricted.
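To make the isometry concrete, here is a minimal sketch (not the paper's implementation) of the ILR map under a fixed Helmert-type basis, one valid choice alongside the learnable bases the paper also considers; the function names and toy Dirichlet compositions are illustrative:

```python
import numpy as np

def helmert_basis(k):
    """Orthonormal k x (k-1) basis whose columns sum to zero (one valid ILR basis)."""
    H = np.zeros((k, k - 1))
    for j in range(1, k):
        H[:j, j - 1] = 1.0 / j
        H[j, j - 1] = -1.0
        H[:, j - 1] /= np.linalg.norm(H[:, j - 1])
    return H

def ilr(z, V):
    """ILR coordinates x = log(z)^T V; equals clr(z) @ V since V's columns sum to zero."""
    return np.log(z) @ V

def aitchison_distance(z1, z2):
    """Aitchison distance: Euclidean distance between centered log-ratio vectors."""
    clr = lambda z: np.log(z) - np.log(z).mean()
    return np.linalg.norm(clr(z1) - clr(z2))

rng = np.random.default_rng(0)
K = 8
z1, z2 = rng.dirichlet(np.ones(K)), rng.dirichlet(np.ones(K))
V = helmert_basis(K)
x1, x2 = ilr(z1, V), ilr(z2, V)

# The load-bearing property: Euclidean distance between ILR coordinates
# equals the Aitchison distance between the underlying compositions.
assert np.isclose(np.linalg.norm(x1 - x2), aitchison_distance(z1, z2))
```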

What carries the argument

Isometric log-ratio (ILR) coordinates of simplex-valued node compositions, which serve as the bridge between Aitchison geometry on the simplex and Euclidean optimization, preserving distances and enabling interpretability of relative archetype abundances.

If this is right

  • Competitive accuracy on node classification and link prediction benchmarks compared to standard graph embedding methods.
  • Built-in explainability through the geometric meaning of coordinates as log-ratios of archetype proportions.
  • Ability to perform subcompositional dimensionality reduction by removing and renormalizing archetype subsets without losing geometric validity (see the sketch after this list).
  • Coherent behavior under component restriction, allowing analysis of how particular archetype groups drive representations and predictions.
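The restriction-and-reclosure operation behind the last two bullets admits a compact sketch. This is not the authors' code; the subset, array shapes, and function names are illustrative, and the closure map mirrors the C(u) ≜ u / (1⊤u) definition quoted in the reference fragments below.

```python
import numpy as np

def closure(u):
    """Re-normalize positive parts back onto the simplex."""
    return u / u.sum(axis=-1, keepdims=True)

def subcomposition(z, keep):
    """Restrict compositions to the archetypes in `keep`, then re-close."""
    return closure(z[..., keep])

rng = np.random.default_rng(1)
Z = rng.dirichlet(np.ones(8), size=4)   # 4 nodes, 8 archetypes (toy data)
keep = [0, 1, 3, 4, 6, 7]               # hypothetical: drop archetypes 2 and 5
Z_sub = subcomposition(Z, keep)         # valid compositions on a smaller simplex

# Subcompositional coherence: ratios among kept parts are unchanged by
# removal and reclosure, so log-ratio analyses on the subcomposition
# agree with those on the full composition.
i, j = 0, 3
assert np.allclose(Z[:, i] / Z[:, j],
                   Z_sub[:, keep.index(i)] / Z_sub[:, keep.index(j)])
```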

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method could be applied to other mixture-based data structures beyond graphs, such as topic models or ecological networks.
  • Learnable ILR bases might adapt to specific graph structures, potentially improving performance in heterogeneous networks.
  • The subcompositional coherence suggests natural ways to handle noisy or incomplete role information in real-world graphs.

Load-bearing premise

Networks can be viewed as having nodes that are mixtures over a fixed set of latent archetypal factors.

What would settle it

If on a standard graph dataset the Aitchison-based embeddings produce significantly lower accuracy on link prediction or node classification than Euclidean baselines, or if restricting components does not yield consistent changes in predictions.
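One way to operationalize that test, as a hedged sketch: the decoder below assumes the σ(α − g(‖x_i − x_j‖₂)) link model quoted in the paper's appendix fragments (see the reference graph below), with g taken to be the identity for simplicity. Scoring ILR embeddings and a Euclidean baseline through the same decoder isolates the effect of the geometry.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def link_probs(X, alpha=1.0):
    """Distance-based Bernoulli decoder over embeddings X (n x d):
    p_ij = sigmoid(alpha - ||x_i - x_j||_2). The g(.) in the paper's
    sigma(alpha - g(.)) form is taken as the identity here (an assumption)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return sigmoid(alpha - D)

# Feeding ILR embeddings and a Euclidean baseline through this same decoder
# on a held-out edge set gives the comparison described above: a consistent
# accuracy gap against the baseline, or incoherent shifts under component
# restriction, would count against the paper's claim.
```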

Figures

Figures reproduced from arXiv: 2605.00716 by Chrysoula Kosma, Giannis Nikolentzos, Michail Chatzianastasis, Nikolaos Nakis, Panagiotis Promponas.

Figure 1
Figure 1: Overview of AICoG. Nodes are represented as compositions on the simplex and compared in Aitchison geometry. An ILR isometry maps compositions to an unconstrained Euclidean space, where distances preserve Aitchison distances. view at source ↗
Figure 2
Figure 2: Cora dataset (D=8). Label-wise distributions along ILR balances under three valid bases: Helmert (left), learned (center), and varimax-rotated learned (right). Top row shows balance loadings (archetypal contributions to each log-ratio contrast); bottom row shows label-wise distributions of the corresponding ILR coordinates. [axis: PC1 (20.1% var)] view at source ↗
Figure 3
Figure 3: Interpretable trade-off trajectories under subcomposition (Cora). Node embeddings are shown in a 2D PCA projection of ILR coordinates (Helmert basis), with PCA fit separately for each K and nodes colored by label. The overlaid curve traces a paired log-ratio intervention applied to the same node across panels, increasing archetype a and decreasing b, followed by closure. The intervention is defined intrinsi… view at source ↗
Figure 4
Figure 4: Subcompositional evaluation on Cora (trained at D=64). We evaluate semantically meaningful component removal by restricting each simplex-based representation to K′ components, applying closure, and probing the resulting D=K′−1 embeddings without retraining. Curves are averaged over 50 random removal masks; higher is better in both panels. view at source ↗
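The paired log-ratio intervention traced in Figure 3 reduces to a few lines. This sketch assumes archetype indices a and b and a step size t, all illustrative rather than taken from the paper's code:

```python
import numpy as np

def paired_logratio_intervention(z, a, b, t):
    """Shift mass from archetype b toward archetype a by t in log space,
    then re-close onto the simplex, mirroring Figure 3's intervention."""
    w = np.log(z).copy()
    w[a] += t / 2.0
    w[b] -= t / 2.0
    u = np.exp(w)
    return u / u.sum()

rng = np.random.default_rng(3)
z = rng.dirichlet(np.ones(8))                    # toy node composition
trajectory = [paired_logratio_intervention(z, a=0, b=4, t=s)
              for s in np.linspace(-2.0, 2.0, 9)]  # curve as in Figure 3
```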
read the original abstract

Representation learning is central to graph machine learning, powering tasks such as link prediction and node classification. However, most graph embeddings are hard to interpret, offering limited insight into how learned features relate to graph structure. Many networks naturally admit a role-mixture view, where nodes are best described as mixtures over latent archetypal factors. Motivated by this structure, we propose a compositional graph embedding framework grounded in Aitchison geometry, the canonical geometry for comparing mixtures. Nodes are represented as simplex-valued compositions and embedded via isometric log-ratio (ILR) coordinates, which preserve Aitchison distances while enabling unconstrained optimization in Euclidean space. This yields intrinsically interpretable embeddings whose geometry reflects relative trade-offs among archetypes and supports coherent behavior under component restriction; we consider both fixed and learnable ILR bases. Across node classification and link prediction, our method achieves competitive performance with strong baselines while providing explainability by construction rather than post-hoc. Finally, subcompositional coherence enables principled component restriction: removing and renormalizing subsets preserves a well-defined geometry, which we exploit via subcompositional dimensionality removal to probe how archetype groups influence representations and predictions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes Aitchison Embeddings for graph representations, where nodes are represented as simplex-valued compositions over latent archetypal factors. These compositions are embedded into Euclidean space using isometric log-ratio (ILR) coordinates, which preserve Aitchison distances. The framework supports both fixed and learnable bases, achieves competitive performance on node classification and link prediction, and provides intrinsic interpretability along with subcompositional coherence for component restriction.

Significance. Should the central claims hold, particularly the natural fit of the role-mixture model to graph nodes and the resulting interpretability, this work would offer a geometrically grounded alternative to standard graph embeddings with built-in explainability. It applies established tools from compositional data analysis (the ILR isometry) to graphs, which could be valuable if performance is indeed competitive without post-hoc explanation. Subcompositional coherence is a standard property of Aitchison geometry, but its exploitation for dimensionality probing is a nice touch.

major comments (1)
  1. [Abstract] The assertion that 'many networks naturally admit a role-mixture view, where nodes are best described as mixtures over latent archetypal factors' is presented without derivation, validation, or references. This premise is central to the significance of the interpretability claims ('intrinsically interpretable embeddings whose geometry reflects relative trade-offs among archetypes'), as without it the simplex constraint and Aitchison geometry may represent an imposed modeling choice rather than a discovery from the data. The manuscript should include analysis showing that this view is appropriate for the evaluated graphs.
minor comments (1)
  1. The abstract is quite dense; separating the technical description of ILR embedding from the claims of interpretability and performance would improve readability.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the single major comment below and outline planned revisions to strengthen the motivation and validation of the role-mixture modeling assumption.

read point-by-point responses
  1. Referee: [Abstract] The assertion that 'many networks naturally admit a role-mixture view, where nodes are best described as mixtures over latent archetypal factors' is presented without derivation, validation, or references. This premise is central to the significance of the interpretability claims ('intrinsically interpretable embeddings whose geometry reflects relative trade-offs among archetypes'), as without it the simplex constraint and Aitchison geometry may represent an imposed modeling choice rather than a discovery from the data. The manuscript should include analysis showing that this view is appropriate for the evaluated graphs.

    Authors: We agree that the role-mixture premise would benefit from explicit supporting references and targeted validation on the evaluated graphs. In the revised manuscript we will expand the introduction and related-work section with citations to the mixed-membership stochastic block model literature (e.g., Airoldi et al., 2008) and role-discovery papers that empirically document overlapping or mixed node roles in real networks. We will also add a concise analysis subsection in the experiments that examines the learned compositions on the node-classification and link-prediction benchmarks. This analysis will report simple statistics (entropy of the simplex vectors and fraction of nodes with non-negligible mass on multiple factors) to demonstrate that the model recovers non-degenerate mixtures rather than collapsing to pure archetypes. These additions will clarify that the simplex constraint is a deliberate modeling choice motivated by interpretability and subcompositional coherence, while showing that it is empirically reasonable for the graphs considered.

    revision: yes
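A minimal sketch of the diagnostics the rebuttal proposes, assuming the learned compositions arrive as an (n × K) array; the 0.05 mass threshold and function names are assumptions, not the authors' protocol:

```python
import numpy as np

def mixture_stats(Z, mass_threshold=0.05):
    """Summarize how mixed the learned compositions are:
    entropy of each simplex vector (0 for a pure archetype) and the
    fraction of nodes with non-negligible mass on multiple factors."""
    Z = np.clip(Z, 1e-12, None)
    entropy = -(Z * np.log(Z)).sum(axis=1)
    n_active = (Z >= mass_threshold).sum(axis=1)
    return {
        "mean_entropy": entropy.mean(),
        "max_entropy": np.log(Z.shape[1]),   # uniform-mixture upper bound
        "frac_mixed": (n_active >= 2).mean(),
    }

rng = np.random.default_rng(2)
Z = rng.dirichlet(np.ones(16) * 0.5, size=1000)  # toy stand-in for learned roles
print(mixture_stats(Z))
```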

Circularity Check

0 steps flagged

No significant circularity; new construction from standard compositional geometry

full rationale

The paper's derivation begins with the modeling assumption that nodes can be represented as simplex-valued compositions over latent archetypes (motivated but not derived from graph data), then applies the standard ILR isometry from Aitchison geometry to obtain Euclidean embeddings. This is a direct construction: the claimed interpretability and subcompositional coherence follow immediately from the properties of the ILR transform and simplex renormalization, without any fitted parameter being relabeled as a prediction, without self-citation chains justifying uniqueness, and without renaming an existing result. The method is presented as a new framework rather than a re-expression of its own outputs, and the technical steps (fixed vs. learnable bases, subcompositional restriction) remain independent of the target task performance.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; free parameters, axioms, and invented entities cannot be audited without the full manuscript.

pith-pipeline@v0.9.0 · 5516 in / 1037 out tokens · 13490 ms · 2026-05-09T19:24:21.882774+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Rank Is Not Capacity: Spectral Occupancy for Latent Graph Models

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    Spectra defines and controls effective capacity in graph embeddings via the Shannon effective rank of a trace-normalized kernel spectrum, making capacity a post-fit property rather than a pre-training hyperparameter.

Reference graph

Works this paper leans on

19 extracted references · 14 canonical work pages · cited by 1 Pith paper · 1 internal anchor

  1. [1] Ahmed, N. K., Rossi, R., Lee, J. B., Willke, T. L., Zhou, R., Kong, X., and Eldardiry, H. Learning role-based graph embeddings. arXiv preprint arXiv:1802.02896, 2018.

  2. [2] Airoldi, E. M., Blei, D. M., Fienberg, S. E., and Xing, E. P. Mixed membership stochastic blockmodels, 2007. URL https://arxiv.org/abs/0705.4485. · Aitchison, J. The Statistical Analysis of Compositional Data. Journal of the Royal Statistical Society: Series B (Methodological), 44(2):139–160, 1982.

  3. [3] Baldassarre, F. and Azizpour, H. Explainability techniques for graph convolutional networks. arXiv preprint arXiv:1905.13686, 2019.

  4. [4] Egozcue, J. J., Pawlowsky-Glahn, V., Mateu-Figueras, G., and Barceló-Vidal, C. Isometric logratio transformations for compositional data analysis. Mathematical Geology, 2003. doi: 10.1023/A:1023818214614. · Epasto, A. and Perozzi, B. Is a single embedding enough? Learning node representations that capture multiple social contexts. In The World Wide Web Conference, pp. 394–404, 2019.

  5. [5] URL https://arxiv.org/abs/2201.05197. · Grover, A. and Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864, 2016.

  6. [6] Hoff, P. D., Raftery, A. E., and Handcock, M. S. Latent space approaches to social network analysis. Journal of the American Statistical Association, 2002. doi: 10.1198/016214502388618906. · Holland, P. W., Laskey, K. B., and Leinhardt, S. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983. doi: 10.1016/0378-8733(83)90021-7. URL https://www.sciencedirect.com/science/article/pii/0378873383900217.

  7. [7] Jin, J., Ke, Z. T., and Luo, S. Mixed membership estimation for social networks.

  8. [8] URL https://arxiv.org/abs/1708.07852. · Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

  9. [9] Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.

  10. [10] Lin, C., Sun, G. J., Bulusu, K. C., Dry, J. R., and Hernandez, M. Graph neural networks including sparse interpretability. arXiv preprint arXiv:2007.00119, 2020.

  11. [11] HM-LDM: A hybrid-membership latent distance model, 2022. URL https://arxiv.org/abs/2206.03463. · Nakis, N., Celikkanat, A., Boucherie, L., Djurhuus, C., Burmester, F., Holmelund, D. M., Frolcová, M., and Mørup, M. Characterizing polarization in social networks using the signed relational latent distance model. In Ruiz, F., Dy, J., and van de Meent, J.-W. (eds.), Proceedings of The 26th International Conference o…

  12. [12] What do GNNs actually learn? Towards understanding their representations, 2024. URL https://arxiv.org/abs/2304.10851. · Perozzi, B., Al-Rfou, R., and Skiena, S. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710, 2014.

  13. [13] Perozzi, B., Kulkarni, V., Chen, H., and Skiena, S. Don't walk, skip! Online learning of multi-scale network embeddings. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 258–265, 2017.

  14. [14] URL https://arxiv.org/abs/2005.07959. · Rozemberczki, B., Kiss, O., and Sarkar, R. Karate Club: An API oriented open-source Python framework for unsupervised learning on graphs. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20), pp. 3125–3132. ACM, 2020.

  15. [15] doi: 10.1086/226141. · Ying, C., Cai, T., Luo, S., Zheng, S., Ke, G., He, D., Shen, Y., and Liu, T.-Y. Do transformers really perform badly for graph representation? Advances in Neural Information Processing Systems, 34:28877–28888, 2021.

  16. [16] Appendix excerpt: This appendix provides supplementary material supporting the main paper. We include formal proofs and derivations, details on the ILR parameterization, complete experimental settings and hyperparameters, and additional empirical results. These materials clarify the methodology and enable ful…

  17. [17] Appendix excerpt (ILR and subcomposition definitions): The ILR coordinates of z_i are the (k−1)-dimensional row vector ILR(z_i) ≜ log(z_i)⊤V ∈ R^(k−1). Fix a subset S ⊆ {1, …, k} with |S| = k′ ≥ 2. Let R_S ∈ R^(k′×k) be the coordinate-selection matrix that extracts the entries in S, so that R_S z = (z_r)_{r∈S} ∈ R^(k′) for any z ∈ R^k. Define the closure (re-normalization) map C : R^(k′)_{>0} → ∆^(k′−1) (the open simplex) by C(u) ≜ u / (1_{k′}⊤ u). The reclosed sub…

  18. [18] Appendix excerpt (link-probability proof): … = σ(α − g(‖x_i − x_j‖_2)), x_i ∈ R^(k−1). Proof: since ILR is bijective, for any collection {x_i}_{i=1}^n ⊂ R^(k−1) there exists a unique collection {z_i}_{i=1}^n ⊂ ∆^(k−1) such that x_i = ILR(z_i) for all i, and conversely z_i = ILR^(−1)(x_i) exists for all i. Because ILR is an isometry, for all i < j, ‖x_i − x_j‖_2 = ‖ILR(z_i) − ILR(z_j)‖_2. Substituting this identity into the respective expre…

  19. [19] Appendix excerpt (experimental protocol): For SLIM-RAA, HM-LDM, MMSBM, and SIMPLEX-EUCLIDEAN, we optimized the negative Bernoulli log-likelihood (matching AICoG up to the model-specific log-odds parameterization) using learning rate 0.05 for 5,000 epochs, so differences are attributable only to the log-odds form. Link prediction: we follow the widely adopted evaluation protoco…