Recognition: no theorem link
The Width Wall: A Strict Expressivity Hierarchy for Hypergraph Neural Networks
Pith reviewed 2026-05-14 20:22 UTC · model grok-4.3
The pith
Hypergraph neural networks cannot represent invariants beyond a fixed hypertree width.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Homomorphism densities generate all continuous hypergraph invariants and organize them into a strict hierarchy indexed by hypertree width. This produces a Width Wall that no fixed-depth hypergraph neural network can cross, even with arbitrary hidden dimensions or training.
What carries the argument
Homomorphism densities, which quantify motif frequencies, arranged into a hierarchy indexed by hypertree width.
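For concreteness, a minimal formalization of the central object in standard graph-limits notation (the exact normalization and width notion are assumptions, not quoted from the paper):

```latex
% Standard graph-limits notation (assumed, not the paper's symbols):
% hom(F, H) counts maps V(F) -> V(H) that send every hyperedge of F
% onto a hyperedge of H; the density normalizes by all possible maps.
\[
  t(F, H) \;=\; \frac{\hom(F, H)}{|V(H)|^{|V(F)|}}
\]
% Level r of the hierarchy collects densities of patterns whose
% (generalized) hypertree width is at most r; the paper's strictness
% claim is that each level distinguishes hypergraphs the previous
% level cannot.
\[
  \mathcal{T}_r \;=\; \bigl\{\, t(F, \cdot) : \mathrm{ghtw}(F) \le r \,\bigr\}
\]
```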
If this is right
- Clique expansion necessarily discards information carried by patterns wider than the chosen clique size (see the sketch after this list).
- Fifteen distinct HGNN architectures are uniformly characterized by the maximum hypertree width they can access.
- Density-aware architectures can represent invariants lying above the width bound of ordinary message passing.
- No increase in model size or training data overcomes the limit for invariants requiring wider hypertrees.
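The first bullet admits a compact demonstration. A minimal sketch in Python (the toy hypergraphs are illustrative, not the paper's constructions): a single 3-vertex hyperedge and its three pairwise edges are different hypergraphs with identical clique expansions, so any model reading only the expansion must conflate them.

```python
from itertools import combinations

def clique_expansion(hyperedges):
    """Reduce a hypergraph to a graph by replacing every hyperedge
    with a clique on its vertices."""
    return {frozenset(pair)
            for e in hyperedges
            for pair in combinations(sorted(e), 2)}

# A single 3-edge and its three pairwise edges are distinct
# hypergraphs, but indistinguishable after clique expansion.
h1 = [{1, 2, 3}]
h2 = [{1, 2}, {2, 3}, {1, 3}]
assert clique_expansion(h1) == clique_expansion(h2)
```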
Where Pith is reading between the lines
- Node-classification performance on real hypergraphs should degrade precisely when the task requires patterns above the model's width.
- Explicit computation of homomorphism densities offers one route to bypass the wall without increasing depth (a minimal estimator sketch follows this list).
- The same width-based separation may appear in other relational models that rely on local aggregation over higher-order relations.
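A minimal sketch of the density route from the second bullet, assuming a uniform-sampling Monte Carlo estimator and a strict-homomorphism convention (both are illustrative choices, not the paper's estimator):

```python
import random

def hom_density_mc(pattern_edges, pattern_n, hyper_vertices,
                   hyperedges, samples=10_000, seed=0):
    """Monte Carlo estimate of t(F, H): the probability that a
    uniformly random map V(F) -> V(H) sends every pattern
    hyperedge onto a hyperedge of H."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in hyperedges}
    hits = 0
    for _ in range(samples):
        phi = [rng.choice(hyper_vertices) for _ in range(pattern_n)]
        if all(frozenset(phi[v] for v in e) in edge_set
               for e in pattern_edges):
            hits += 1
    return hits / samples

# Density of a single 3-edge pattern in a toy 3-uniform hypergraph.
H_vertices = list(range(6))
H_edges = [{0, 1, 2}, {1, 2, 3}, {3, 4, 5}]
pattern = [[0, 1, 2]]  # F: one hyperedge on pattern vertices 0, 1, 2
print(hom_density_mc(pattern, 3, H_vertices, H_edges))
```

Standard Monte Carlo error decays as O(1/sqrt(samples)), so wide patterns with small densities would need proportionally more samples.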
Load-bearing premise
Homomorphism densities together with invariant approximation fully capture the continuous invariants relevant to HGNN expressivity, and the resulting hierarchy is strict for every architecture considered.
What would settle it
A fixed-depth HGNN that accurately distinguishes or classifies hypergraphs differing only by an invariant whose minimal representation requires a hypertree width strictly larger than the model's message-passing width.
Original abstract
Hypergraphs provide a natural framework to model higher-order interactions in scientific, social, and biological systems. Hypergraph neural networks (HGNNs) aim to learn from such data, yet it remains unclear which higher-order structures these models can represent. We show that hypergraph expressivity is governed by which small patterns an architecture can detect and count. We formalize this via homomorphism densities, which measure how often a structural motif appears in a hypergraph. Combining classical homomorphism-count completeness with invariant approximation, we show that homomorphism densities generate all continuous hypergraph invariants and organize them into a strict hierarchy indexed by hypertree width. This yields a Width Wall: a fundamental architectural limit beyond which no hidden dimension, training procedure or fixed-depth HGNN can represent invariants requiring wider patterns. Our framework provides a unified characterization of 15 HGNN architectures, precisely identifies information lost by clique expansion, and motivates density-aware models that extend expressivity beyond bounded-width message passing. We experimentally validate this finding on an APPLICATION NODE CLASSIFICATION SUITE of real-world hypergraphs, where the Width Wall predicts when graph-reduction baselines fail and when density features help.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that hypergraph expressivity is governed by homomorphism densities, which measure motif occurrences. Combining classical homomorphism-count completeness with invariant approximation, it shows these densities generate all continuous hypergraph invariants and induce a strict hierarchy indexed by hypertree width. This establishes the Width Wall: no hidden dimension, training procedure, or fixed-depth HGNN can represent invariants requiring wider patterns. The framework unifies 15 HGNN architectures, identifies information lost by clique expansion, and is validated experimentally on a node classification suite of real-world hypergraphs where the wall predicts failure of graph-reduction baselines.
Significance. If the result holds, the work provides a fundamental, parameter-free characterization of HGNN limits via homomorphism densities and hypertree width. The unification of 15 architectures and precise identification of clique-expansion losses are notable strengths, as is the motivation for density-aware models. The experimental validation on real hypergraphs adds practical relevance by showing when the theoretical wall manifests in application node classification.
major comments (1)
- [Main theorem on invariant generation and Width Wall] The central derivation (combining classical homomorphism-count completeness with the invariant-approximation step) lacks explicit error bounds or full derivation details for the approximation. This is load-bearing for the claim that homomorphism densities generate all continuous invariants and that the resulting hypertree-width hierarchy is strict for every fixed-depth HGNN architecture.
minor comments (1)
- [Abstract] The abstract capitalizes 'APPLICATION NODE CLASSIFICATION SUITE' without defining it; clarify whether this is a named benchmark or rephrase for readability.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the paper's contributions, including the unification of 15 HGNN architectures, the identification of clique-expansion losses, and the experimental validation on real-world hypergraphs. We address the major comment below and will revise the manuscript accordingly to strengthen the central derivation.
Point-by-point responses
- Referee: [Main theorem on invariant generation and Width Wall] The central derivation (combining classical homomorphism-count completeness with the invariant-approximation step) lacks explicit error bounds or full derivation details for the approximation. This is load-bearing for the claim that homomorphism densities generate all continuous invariants and that the resulting hypertree-width hierarchy is strict for every fixed-depth HGNN architecture.
  Authors: We agree that the proof of the main theorem would benefit from expanded details. In the revised manuscript we will provide a complete, self-contained derivation that first recalls the classical homomorphism-count completeness theorem and then explicitly constructs the invariant-approximation step. We will derive explicit error bounds using the uniform continuity of continuous invariants on the compact space of hypergraphs with bounded degree and size; these bounds will quantify the approximation error in terms of the modulus of continuity and the sampling density of the homomorphism densities. The revised proof will thereby rigorously establish both that homomorphism densities generate all continuous hypergraph invariants and that the induced hypertree-width hierarchy is strict for any fixed-depth HGNN, independent of hidden dimension or training procedure.
  Revision: yes
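The bound the authors promise has a standard Stone–Weierstrass shape. A hedged reconstruction of that step, in generic notation the paper may not use:

```latex
% Hedged sketch (generic symbols, not the paper's): K is the compact
% space of hypergraphs of bounded degree and size, I : K -> R a
% continuous invariant. If the algebra generated by the densities
% t(F, .) separates the points of K and contains the constants, then
% Stone--Weierstrass yields, for every eps > 0, finitely many patterns
% F_1, ..., F_k and a polynomial p such that
\[
  \sup_{H \in K}
  \bigl|\, I(H) - p\bigl(t(F_1, H), \dots, t(F_k, H)\bigr) \,\bigr|
  \;\le\; \varepsilon .
\]
% The quantitative version promised in the rebuttal would make eps
% explicit in the modulus of continuity of I.
```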
Circularity Check
Derivation self-contained via classical results
full rationale
The paper combines classical homomorphism-count completeness with an invariant-approximation step to show that homomorphism densities generate all continuous hypergraph invariants and organize them into a strict hypertree-width hierarchy. No equations reduce the central claims (Width Wall, expressivity limits) to fitted parameters, self-citations, or definitional equivalences. The hierarchy is derived from standard expressivity theory applied to hypergraphs without internal reduction to the paper's own inputs or prior author work. Experimental validation is presented separately and does not bear the proof load.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Homomorphism densities generate all continuous hypergraph invariants.
- Ad hoc to paper: The resulting hierarchy is strict and applies to all fixed-depth HGNN architectures.
invented entities (1)
- Width Wall: no independent evidence
Reference graph
Works this paper leans on
- [1] P.-A. Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008.
- [2] Ryan P. Adams and Richard S. Zemel. Ranking via Sinkhorn propagation. arXiv preprint arXiv:1106.1925, 2011.
- [3] Devanshu Arya, Deepak Kumar Gupta, Stevan Rudinac, and Marcel Worring. HyperSAGE: Generalizing inductive representation learning on hypergraphs. arXiv preprint arXiv:2010.04558, 2020.
- [4] Chaewoon Bae, Doyun Choi, Jaehyun Lee, and Jaemin Yoo. Parameter-free hypergraph neural network for few-shot node classification. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [5] Peter L. Bartlett, Dylan J. Foster, and Matus J. Telgarsky. Spectrally-normalized margin bounds for neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
- [6] Federico Battiston, Giulia Cencetti, Iacopo Iacopini, Vito Latora, Maxime Lucas, Alice Patania, Jean-Gabriel Young, and Giovanni Petri. Networks beyond pairwise interactions: Structure and dynamics. Physics Reports, 874:1–92, 2020.
- [7] Austin R. Benson, David F. Gleich, and Jure Leskovec. Higher-order organization of complex networks. Science, 353(6295):163–166, 2016.
- [8] Cristian Bodnar, Fabrizio Frasca, Yu Guang Wang, Nina Otter, Guido Montufar, Pietro Liò, and Michael Bronstein. Weisfeiler and Lehman go topological: Message passing simplicial networks. In International Conference on Machine Learning (ICML), 2021.
- [9] Jan Böker. Color refinement, homomorphisms, and hypergraphs. In Graph-Theoretic Concepts in Computer Science (WG), volume 11789 of LNCS, pages 338–350, 2019. doi: 10.1007/978-3-030-30786-8_26.
- [10] Michael M. Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478, 2021.
- [11] Jin-Yi Cai, Martin Fürer, and Neil Immerman. An optimal lower bound on the number of variables for graph identification. Combinatorica, 12(4):389–410, 1992.
- [12] Davi Castro-Silva, Fernando Mário de Oliveira Filho, Lucas Slot, and Frank Vallentin. A recursive theta body for hypergraphs. Combinatorica, 43:909–938, 2023.
- [13] T.-H. Hubert Chan, Anand Louis, Zhihao Gavin Tang, and Chenzi Zhang. Spectral properties of hypergraph Laplacian and approximation algorithms. Journal of the ACM, 65(3), 2018.
- [14] Eli Chien, Chao Pan, Jianhao Peng, and Olgica Milenkovic. You are AllSet: A multiset function framework for hypergraph neural networks. In International Conference on Learning Representations (ICLR), 2022.
- [15] Seongjin Choi, Gahee Kim, and Se-Young Yun. Weisfeiler and Lehman go categorical. arXiv preprint arXiv:2602.06787, 2026.
- [16] Charles J. Colbourn and Jeffrey H. Dinitz. Handbook of Combinatorial Designs. CRC Press, 2nd edition, 2007.
- [17] Chandler Davis. All convex invariant functions of Hermitian matrices. Archiv der Mathematik, 8:276–278, 1957.
- [18] Etienne de Klerk and Renata Sotirov. Exploiting group symmetry in semidefinite programming relaxations of the quadratic assignment problem. Mathematical Programming, 122:225–246, 2010.
- [19] Holger Dell, Martin Grohe, and Gaurav Rattan. Lovász meets Weisfeiler and Leman. In International Colloquium on Automata, Languages, and Programming (ICALP), 2018.
- [20] Yihe Dong, Will Sawin, and Yoshua Bengio. HNHN: Hypergraph networks with hyperedge neurons. In ICML Workshop on Graph Representation Learning and Beyond (GRL+), 2020. arXiv:2006.12278.
- [21] Iulia Duta, Giulia Cassarà, Fabrizio Silvestri, and Pietro Liò. Sheaf hypergraph networks. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- [22] Gábor Elek and Balázs Szegedy. A measure-theoretic approach to the theory of dense hypergraphs. Advances in Mathematics, 231(3–4):1731–1772, 2012.
- [23] Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, and Yue Gao. Hypergraph neural networks. In AAAI Conference on Artificial Intelligence, 2019.
- [24] James H. Fowler. Legislative cosponsorship networks in the US House and Senate. Social Networks, 28(4):454–465, 2006.
- [25] Noah Golowich, Alexander Rakhlin, and Ohad Shamir. Size-independent sample complexity of neural networks. In Proceedings of the 31st Conference on Learning Theory (COLT), 2018.
- [26] Georg Gottlob, Nicola Leone, and Francesco Scarcello. Hypertree decompositions and tractable queries. Journal of Computer and System Sciences, 64(3):579–627, 2002.
- [27] Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, et al. Topological deep learning: Going beyond graph data. arXiv preprint arXiv:2206.00606, 2022.
- [28] Jing Huang and Jie Yang. UniGNN: A unified framework for graph and hypergraph neural networks. In International Joint Conference on Artificial Intelligence (IJCAI), 2021.
- [29] Nicolas Keriven and Gabriel Peyré. Universal invariant and equivariant graph neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- [30] Jinwoo Kim, Saeyoon Oh, Sungjun Cho, and Seunghoon Hong. Equivariant hypergraph neural networks. In European Conference on Computer Vision (ECCV), 2022.
- [31] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR), 2015.
- [32] Tamara G. Kolda and Brett W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455–500, 2009.
- [33] Adrian S. Lewis. The convex analysis of unitarily invariant matrix functions. Journal of Convex Analysis, 2:173–183, 1995.
- [34] Pan Li and Olgica Milenkovic. Submodular hypergraphs: p-Laplacians, Cheeger inequalities and spectral clustering. In International Conference on Machine Learning (ICML), 2018.
- [35] Xiaoyu Li, Guangyu Tang, and Jiaojiao Jiang. Implicit hypergraph neural networks: A stable framework for higher-order relational learning with provable guarantees. arXiv preprint arXiv:2508.09427, 2025.
- [36] László Lovász. Operations with structures. Acta Mathematica Hungarica, 18(3–4):321–328, 1967.
- [37] László Lovász. Large Networks and Graph Limits, volume 60 of Colloquium Publications. American Mathematical Society, 2012.
- [38] László Lovász and Balázs Szegedy. Limits of dense graph sequences. Journal of Combinatorial Theory, Series B, 96(6):933–957, 2006.
- [39] Haggai Maron, Heli Ben-Hamu, Hadar Serviansky, and Yaron Lipman. Provably powerful graph networks. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- [40] Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and equivariant graph networks. In International Conference on Learning Representations (ICLR), 2019.
- [41] Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In AAAI Conference on Artificial Intelligence, 2019.
- [42] Christopher Morris, Yaron Lipman, Haggai Maron, Bastian Rieck, Nils M. Kriege, Martin Grohe, Matthias Fey, and Karsten Borgwardt. Weisfeiler and Leman go machine learning: The story so far. Journal of Machine Learning Research, 24(333):1–59, 2023.
- [43] Benjamin Scheidt. On homomorphism indistinguishability and hypertree depth. In International Colloquium on Automata, Languages, and Programming (ICALP), 2024.
- [44] Benjamin Scheidt and Nicole Schweikardt. Counting homomorphisms from hypergraphs of bounded generalised hypertree width: A logical characterisation. In International Symposium on Mathematical Foundations of Computer Science (MFCS), 2023.
- [45] Richard Sinkhorn. A relationship between arbitrary positive matrices and doubly stochastic matrices. Annals of Mathematical Statistics, 35:876–879, 1964.
- [46] Bernd Sturmfels. Algorithms in Invariant Theory. Springer, 2nd edition, 2008.
- [47] Bohan Tang, Zexi Liu, Keyue Jiang, Siheng Chen, and Xiaowen Dong. Training-free message passing for learning on hypergraphs. In International Conference on Learning Representations (ICLR), 2025.
- [48] Peihao Wang, Shenghao Yang, Yunyu Liu, Zhangyang Wang, and Pan Li. Equivariant hypergraph diffusion neural operators. In International Conference on Learning Representations (ICLR), 2023.
- [49] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In International Conference on Learning Representations (ICLR), 2019.
- [50] Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, and Partha Talukdar. HyperGCN: A new method for training graph convolutional networks on hypergraphs. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- [51] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabás Póczos, Ruslan Salakhutdinov, and Alexander J. Smola. Deep sets. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
- [52] Detian Zhang, Chengqiang Zhang, Yanghui Rao, Qing Li, and Chunjiang Zhu. Improved expressivity of hypergraph neural networks through high-dimensional generalized Weisfeiler-Leman algorithms. In International Conference on Machine Learning (ICML), volume 267 of PMLR, pages 76880–76908, 2025.
- [53] Yufei Zhao. Hypergraph limits: A regularity approach. Random Structures & Algorithms, 47(2):205–226, 2015.