pith. machine review for the scientific record.

arxiv: 2604.23324 · v1 · submitted 2026-04-25 · 💻 cs.LG · cs.AI

Recognition: unknown

Layer Embedding Deep Fusion Graph Neural Network

Taihua Xu, Genhao Tian, Jicong Fan, Xibei Yang, Qinghua Zhang, Yun Cui

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 08:16 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords graph neural networks · heterophily · over-smoothing · semi-supervised classification · layer embedding fusion · dual topology · node classification

The pith

LEDF-GNN fuses multi-layer embeddings nonlinearly and runs original plus reconstructed topologies in parallel to improve GNN performance on both homophilic and heterophilic graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies two core limits in standard Graph Neural Networks: message passing assumes connected nodes share labels, which breaks on heterophilic graphs, and repeated diffusion amplifies noise, leading to over-smoothing as layers deepen. It proposes the Layer Embedding Deep Fusion operator to combine embeddings from multiple layers in a nonlinear way that captures inter-layer relations and slows degradation. At the same time, the Dual-Topology Parallel Strategy processes the original graph and a reconstructed version together, so the model can adapt its structure and semantics to different homophily levels. Experiments on citation networks and image-derived graphs show the resulting model beats existing baselines in semi-supervised node classification under both homophilic and heterophilic conditions.
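
To make the fusion step concrete, here is a minimal PyTorch sketch of nonlinear multi-layer fusion. The paper's exact LEDF operator (with propagation depths Q1 and Q2, per the supplementary material) is not specified in the text excerpted here, so the `LayerFusion` module and its concatenate-then-MLP form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LayerFusion(nn.Module):
    """Nonlinear fusion of per-layer node embeddings (illustrative).

    Concatenates the embeddings produced at every propagation depth
    and mixes them with a small MLP, so inter-layer relations are
    modeled jointly rather than by a linear combination of layers.
    """

    def __init__(self, dim: int, num_layers: int):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Linear(dim * num_layers, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, layer_embs: list[torch.Tensor]) -> torch.Tensor:
        # layer_embs: one (num_nodes, dim) tensor per GNN layer.
        return self.mix(torch.cat(layer_embs, dim=-1))


# Toy usage: fuse three layers of 64-dimensional embeddings for 5 nodes.
fusion = LayerFusion(dim=64, num_layers=3)
fused = fusion([torch.randn(5, 64) for _ in range(3)])  # -> (5, 64)
```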

Core claim

LEDF-GNN introduces a Layer Embedding Deep Fusion operator that nonlinearly fuses multi-layer embeddings to capture inter-layer dependencies and reduce deep propagation degradation, paired with a Dual-Topology Parallel Strategy that runs the original and reconstructed topologies in parallel for adaptive structure-semantics co-optimization, yielding consistent gains over state-of-the-art methods in semi-supervised classification on citation and image benchmarks under both homophilic and heterophilic regimes.

What carries the argument

The Layer Embedding Deep Fusion (LEDF) operator, which performs nonlinear fusion across layer embeddings, together with the Dual-Topology Parallel Strategy (DTPS) that processes original and reconstructed graph topologies simultaneously.
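
Figure 2 indicates that the two topology branches are merged with node-wise weight matrices α and β. A hedged sketch of that combination step, with a softmax parameterization we are assuming for illustration rather than taking from the paper:

```python
import torch
import torch.nn as nn


class DualTopologyCombine(nn.Module):
    """Schematic dual-topology blending (cf. Figure 2).

    Given embeddings computed on the original topology and on a
    reconstructed one, blend them with learned node-wise weights,
    playing the role of the alpha/beta matrices in Figure 2. The
    softmax constraint (weights summing to 1 per node) is an
    assumption of this sketch.
    """

    def __init__(self, num_nodes: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_nodes, 2))

    def forward(self, z_orig: torch.Tensor, z_recon: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.logits, dim=1)   # (num_nodes, 2)
        alpha, beta = weights[:, :1], weights[:, 1:]  # node-wise alpha, beta
        return alpha * z_orig + beta * z_recon
```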

If this is right

  • LEDF-GNN outperforms state-of-the-art baselines in semi-supervised node classification on citation and image benchmarks.
  • The model maintains performance under both homophilic and heterophilic graph conditions.
  • Nonlinear layer fusion reduces the degradation that normally occurs with increased network depth.
  • Dual-topology processing allows adaptive co-optimization of structure and semantics without manual rewiring.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same fusion operator could be applied to other GNN architectures to extend their effective depth on long-range tasks.
  • Dual-topology processing may reduce reliance on separate graph-reconstruction pre-processing steps in heterophilic settings.
  • The approach opens the possibility of testing whether nonlinear inter-layer fusion lowers sensitivity to hyperparameter choices across GNN families.

Load-bearing premise

The LEDF operator and Dual-Topology Parallel Strategy will capture inter-layer dependencies and adapt to varying homophily without creating new over-smoothing or needing extensive tuning that hurts generalization.
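
"Varying homophily" is measurable, which makes this premise testable per dataset. The standard edge-homophily ratio, the fraction of edges whose endpoints share a label (Cora's reported value is 0.81 in the supplementary table), can be computed directly; this is the conventional definition, not code from the paper:

```python
import torch


def edge_homophily(edge_index: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of edges that join same-label nodes.

    edge_index: (2, num_edges) tensor of source/target node indices.
    labels:     (num_nodes,) tensor of integer class labels.
    Values near 1 mean a homophilic graph; values near 0, heterophilic.
    """
    src, dst = edge_index
    return (labels[src] == labels[dst]).float().mean().item()


# Toy check: a 4-cycle whose adjacent nodes alternate labels -> homophily 0.0.
edges = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
labels = torch.tensor([0, 1, 0, 1])
assert edge_homophily(edges, labels) == 0.0
```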

What would settle it

Train deeper versions of LEDF-GNN on additional heterophilic datasets and measure whether node classification accuracy drops, or over-smoothing metrics rise, relative to shallower baselines; if the gains disappear with depth, the central claim is falsified.
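
One concrete instrument for that test is an over-smoothing statistic such as mean average distance (MAD, reference [5]): the mean pairwise cosine distance between node embeddings, which collapses toward 0 as representations smooth out. The sketch below follows that standard definition; it is a diagnostic one could apply per layer, not code from the paper.

```python
import torch
import torch.nn.functional as F


def mean_average_distance(H: torch.Tensor) -> float:
    """Mean pairwise cosine distance between node embeddings.

    H: (num_nodes, dim) embedding matrix from one layer. A value
    drifting toward 0 as depth grows is the usual over-smoothing
    symptom. Uses a dense O(n^2) similarity matrix, which is fine
    for benchmark-sized graphs.
    """
    Hn = F.normalize(H, dim=1)
    sim = Hn @ Hn.t()                 # pairwise cosine similarities
    n = H.size(0)
    mean_sim = (sim.sum() - sim.diagonal().sum()) / (n * (n - 1))
    return 1.0 - mean_sim.item()
```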

Figures

Figures reproduced from arXiv: 2604.23324 by Genhao Tian, Jicong Fan, Qinghua Zhang, Taihua Xu, Xibei Yang, Yun Cui.

Figure 1: Min-Max normalized classification accuracy of ho…
Figure 2: The pipeline of LEDF-GNN. Ⓛ is the LEDF operator, while α and β are the node-wise weight matrices of the embedding representations corresponding to the original and reconstructed topologies, respectively.
Figure 3: The accuracy contrast of LEDF-GNN and its DTPS ablation…
Figure 5: The accuracy contrast of LEDF-GNN and its validation…
Figure 6: The distribution of attention weights across the layers of…
Figure 9: The proportion of attention weights in shallow and deep…
Figure 7: The distribution of attention weights across the layers of…
Figure 8: The proportion of attention weights in shallow and deep…
Figure 10: The visualization comparison of the backbones (MLP and GCN) and the backbones with baselines (LEDF-GNN, BORF, ComFy…
Figure 11: The visualization comparison of the backbones (MLP and GCN) and the backbones with baselines (LEDF-GNN, BORF, ComFy…
Figure 12: The visualization comparison of the backbones (MLP and GCN) and the backbones with baselines (LEDF-GNN, BORF and…
Figure 13: The visualization comparison of the backbones (MLP and GCN) and the backbones with baselines (LEDF-GNN, BORF, ComFy…
Figure 14: Sensitivity analysis of two hyperparameters (…
read the original abstract

Graph Neural Networks (GNNs) have demonstrated impressive performance in learning representations from graph-structured data. However, their message-passing mechanism inherently relies on the assumption of label consistency among connected nodes, limiting their applicability to low-homophily settings. Moreover, since message passing operates as a hierarchical diffusion process, GNNs face challenges in capturing long-range dependencies. As network depth increases, the structural noise along heterophilic edges tends to be amplified, resulting in over-smoothing. This issue becomes especially prominent in highly heterophilic graphs, where the propagation of inconsistent semantics across the topology continually exacerbates misaggregation. To address this issue, we propose a novel framework named Layer Embedding Deep Fusion Graph Neural Network (LEDF-GNN). Specifically, we design a Layer Embedding Deep Fusion (LEDF) operator that nonlinearly fuses multi-layer embeddings to capture inter-layer dependencies and effectively alleviate deep propagation degradation. Meanwhile, to mitigate structural heterophily, LEDF-GNN employs a Dual-Topology Parallel Strategy (DTPS) that simultaneously leverages the original and reconstructed topologies, allowing for adaptive structure-semantics co-optimization under diverse homophily conditions. Extensive semi-supervised classification experiments on the citation and image benchmarks demonstrate that, under both homophilic and heterophilic settings, LEDF-GNN consistently outperforms state-of-the-art baselines, validating its effectiveness and generalization capability across diverse graph types.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes Layer Embedding Deep Fusion Graph Neural Network (LEDF-GNN) to address over-smoothing and heterophily limitations in standard GNN message-passing. It introduces a Layer Embedding Deep Fusion (LEDF) operator for nonlinear fusion of multi-layer embeddings to capture inter-layer dependencies, and a Dual-Topology Parallel Strategy (DTPS) that processes original and reconstructed topologies in parallel for adaptive structure-semantics co-optimization. The central claim is that extensive semi-supervised node classification experiments on citation and image benchmarks demonstrate consistent outperformance over state-of-the-art baselines under both homophilic and heterophilic settings.

Significance. If the experimental claims hold with proper ablations and statistical validation, the work could contribute a practical approach to mitigating deep propagation degradation and heterophily-induced misaggregation in GNNs. The dual-topology and deep-fusion ideas target well-known pain points and might generalize across graph types, but the abstract provides no equations, results, or controls to evaluate whether the operators achieve the claimed benefits without new drawbacks such as increased tuning burden or residual smoothing.

major comments (2)
  1. [Abstract] The claim that LEDF-GNN 'consistently outperforms state-of-the-art baselines' is accompanied by no equations, ablation results, statistical significance tests, baseline details, or performance metrics; the central empirical claim therefore cannot be verified from the available text, and the experimental design remains opaque.
  2. [Abstract] The weakest assumption, that the LEDF operator and DTPS will nonlinearly capture inter-layer dependencies and enable adaptive co-optimization without introducing new over-smoothing or requiring extensive hyperparameter tuning, is stated without any preliminary analysis, complexity discussion, or theoretical justification, even though it is load-bearing for the motivation and generalization claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and for identifying areas where the abstract could better convey the manuscript's contributions. We address the two major comments point by point below. The full paper contains the requested details on methods, experiments, and analysis; we propose targeted revisions to the abstract to improve transparency without altering its length constraints.

read point-by-point responses
  1. Referee: [Abstract] The claim that LEDF-GNN 'consistently outperforms state-of-the-art baselines' is accompanied by no equations, ablation results, statistical significance tests, baseline details, or performance metrics; the central empirical claim therefore cannot be verified from the available text, and the experimental design remains opaque.

    Authors: We agree that the abstract is a high-level summary and therefore omits specific equations, ablation tables, statistical details, and numerical metrics. These elements are fully documented in the manuscript: the LEDF equations appear in Section 3.2, DTPS in Section 3.3, baseline descriptions and experimental protocol in Section 4.1, and results (including means, standard deviations from repeated runs, and comparisons across homophilic/heterophilic settings) in Tables 1–3 and Section 4.2. We will revise the abstract to include a concise statement of the average performance gains and a note that comprehensive ablations and statistical validation are provided in the experimental section. revision: yes

  2. Referee: [Abstract] The weakest assumption, that the LEDF operator and DTPS will nonlinearly capture inter-layer dependencies and enable adaptive co-optimization without introducing new over-smoothing or requiring extensive hyperparameter tuning, is stated without any preliminary analysis, complexity discussion, or theoretical justification, even though it is load-bearing for the motivation and generalization claims.

    Authors: The abstract condenses the motivation; the preliminary analysis of over-smoothing and heterophily appears in the Introduction, the nonlinear fusion rationale and inter-layer dependency capture are justified in Section 3.2, the adaptive co-optimization via DTPS is explained in Section 3.3, and complexity (linear in depth) plus empirical checks for residual smoothing are discussed in Sections 3.4 and 4.3. We acknowledge the abstract could better signal these supporting elements and will add a brief clause referencing the design motivations and validation that no new smoothing or excessive tuning burden is introduced. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes the LEDF operator and DTPS strategy as novel architectural components to address over-smoothing and heterophily in GNNs. The derivation consists of defining these operators to nonlinearly fuse embeddings and leverage dual topologies, followed by empirical validation on citation and image benchmarks. No load-bearing step reduces a claimed prediction or result to a fitted input by construction, no self-citation chain justifies a uniqueness theorem, and no ansatz is smuggled via prior work. The central claims rest on independent experimental outperformance rather than self-referential re-derivation, making the chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

Based on abstract only. The framework rests on standard GNN message-passing assumptions plus two newly introduced operators whose internal mechanics and parameter counts are unspecified.

axioms (2)
  • domain assumption Message-passing mechanism inherently relies on label consistency among connected nodes
    Explicitly stated as the core limitation of GNNs in the abstract.
  • domain assumption Structural noise along heterophilic edges is amplified with network depth
    Presented as the cause of over-smoothing in deep propagation.
invented entities (2)
  • Layer Embedding Deep Fusion (LEDF) operator no independent evidence
    purpose: Nonlinearly fuses multi-layer embeddings to capture inter-layer dependencies
    Newly proposed component; no independent evidence or prior reference provided.
  • Dual-Topology Parallel Strategy (DTPS) no independent evidence
    purpose: Simultaneously leverages original and reconstructed topologies for adaptive co-optimization
    Newly proposed component; no independent evidence or prior reference provided.

pith-pipeline@v0.9.0 · 5552 in / 1462 out tokens · 109261 ms · 2026-05-08T08:16:23.387524+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

44 extracted references · 5 canonical work pages · 2 internal anchors

  1. [1]

    Make heterophilic graphs better fit GNN: A graph rewiring approach. IEEE Transactions on Knowledge and Data Engineering, 2024

    Wendong Bi, Lun Du, Qiang Fu, Yanlin Wang, Shi Han, and Dongmei Zhang. Make heterophilic graphs better fit GNN: A graph rewiring approach. IEEE Transactions on Knowledge and Data Engineering, 2024.

  2. [2]

    Combining labeled and unlabeled data with co-training

    Avrim Blum and Tom Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the eleventh annual conference on Computational learning theory, pages 92–100, 1998.

  3. [3]

    A note on over-smoothing for graph neural networks

    Chen Cai and Yusu Wang. A note on over-smoothing for graph neural networks. arXiv preprint arXiv:2006.13318, 2020.

  4. [4]

    A multi-scale approach for graph link prediction

    Lei Cai and Shuiwang Ji. A multi-scale approach for graph link prediction. In Proceedings of the AAAI conference on artificial intelligence, pages 3308–3315, 2020.

  5. [5]

    Measuring and relieving the over-smoothing problem for graph neural networks from the topological view

    Deli Chen, Yankai Lin, Wei Li, Peng Li, Jie Zhou, and Xu Sun. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proceedings of the AAAI conference on artificial intelligence, pages 3438–3445, 2020.

  6. [6]

    Simple and deep graph convolutional networks

    Ming Chen, Zhewei Wei, Zengfeng Huang, Bolin Ding, and Yaliang Li. Simple and deep graph convolutional networks. In International conference on machine learning, pages 1725–1735. PMLR, 2020.

  7. [7]

    Gbk-gnn: Gated bi-kernel graph neural networks for modeling both homophily and heterophily

    Lun Du, Xiaozhou Shi, Qiang Fu, Xiaojun Ma, Hengyu Liu, Shi Han, and Dongmei Zhang. Gbk-gnn: Gated bi-kernel graph neural networks for modeling both homophily and heterophily. In Proceedings of the ACM web conference 2022, pages 1550–1558, 2022.

  8. [8]

    Predict then propagate: Graph neural networks meet personalized pagerank

    Johannes Gasteiger, Aleksandar Bojchevski, and Stephan Günnemann. Predict then propagate: Graph neural networks meet personalized pagerank. In International Conference on Learning Representations, 2019.

  9. [9]

    Neural message passing for quantum chemistry

    Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. In International conference on machine learning, pages 1263–1272. PMLR, 2017.

  10. [10]

    Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017

    Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017.

  11. [11]

    G-mixup: Graph data augmentation for graph classification

    Xiaotian Han, Zhimeng Jiang, Ninghao Liu, and Xia Hu. G-mixup: Graph data augmentation for graph classification. In International conference on machine learning, pages 8230–…, 2022.

  12. [12]

    Harnessing explanations: Llm-to-lm interpreter for enhanced text-attributed graph representation learning

    Xiaoxin He, Xavier Bresson, Thomas Laurent, Adam Perold, Yann LeCun, and Bryan Hooi. Harnessing explanations: Llm-to-lm interpreter for enhanced text-attributed graph representation learning. In ICLR, 2024.

  13. [13]

    On which nodes does gcn fail? enhancing gcn from the node perspective

    Jincheng Huang, Jialie Shen, Xiaoshuang Shi, and Xiaofeng Zhu. On which nodes does gcn fail? enhancing gcn from the node perspective. In Forty-first International Conference on Machine Learning, 2024.

  14. [14]

    Adam: A Method for Stochastic Optimization

    Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

  15. [15]

    Semi-supervised classification with graph convolutional networks

    Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations, 2017.

  16. [16]

    The mnist database of handwritten digits

    Yann LeCun and Corinna Cortes. The mnist database of handwritten digits. Technical report, AT&T Labs, 2010.

  17. [17]

    Deepgcns: Making gcns go as deep as cnns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021

    Guohao Li, Matthias Müller, Guocheng Qian, Itzel Carolina Delgadillo Perez, Abdulellah Abualshour, Ali Kassem Thabet, and Bernard Ghanem. Deepgcns: Making gcns go as deep as cnns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

  18. [18]

    Agmixup: Adaptive graph mixup for semi-supervised node classification

    Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang, Yibing Zhan, Yiheng Lu, and Dapeng Tao. Agmixup: Adaptive graph mixup for semi-supervised node classification. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 19143–19151, 2025.

  19. [19]

    Co-embedding attributed networks

    Zaiqiao Meng, Shangsong Liang, Hongyan Bao, and Xiangliang Zhang. Co-embedding attributed networks. In Proceedings of the twelfth ACM international conference on web search and data mining, pages 393–401, 2019.

  20. [20]

    Revisiting over-smoothing and over-squashing using ollivier-ricci curvature

    Khang Nguyen, Nong Minh Hieu, Vinh Duc Nguyen, Nhat Ho, Stanley Osher, and Tan Minh Nguyen. Revisiting over-smoothing and over-squashing using ollivier-ricci curvature. 2025.

  21. [21]

    Geom-gcn: Geometric graph convolutional networks

    Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, and Bo Yang. Geom-gcn: Geometric graph convolutional networks. 2020.

  22. [22]

    Dropedge: Towards deep graph convolutional networks on node classification

    Yu Rong, Wenbing Huang, Tingyang Xu, and Junzhou Huang. Dropedge: Towards deep graph convolutional networks on node classification. In International Conference on Learning Representations, 2019.

  23. [23]

    Multi-scale attributed node embedding. Journal of Complex Networks, 9(2):cnab014, 2021

    Benedek Rozemberczki, Carl Allen, and Rik Sarkar. Multi-scale attributed node embedding. Journal of Complex Networks, 9(2):cnab014, 2021.

  24. [24]

    Gnns getting comfy: Community and feature similarity guided rewiring

    Celia Rubio-Madrigal, Adarsh Jamadandi, and Rebekka Burkholz. Gnns getting comfy: Community and feature similarity guided rewiring. In The Thirteenth International Conference on Learning Representations, 2025.

  25. [25]

    Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986

    David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986.

  26. [26]

    Pitfalls of Graph Neural Network Evaluation

    Oleksandr Shchur, Maximilian Mumme, Aleksandar Bojchevski, and Stephan Günnemann. Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868, 2018.

  27. [27]

    Graph attention networks

    Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In International Conference on Learning Representations, 2018.

  28. [28]

    Heterogeneous graph attention network

    Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip S Yu. Heterogeneous graph attention network. In The world wide web conference, pages 2022–2032, 2019.

  29. [29]

    Am-gcn: Adaptive multi-channel graph convolutional networks

    Xiao Wang, Meiqi Zhu, Deyu Bo, Peng Cui, Chuan Shi, and Jian Pei. Am-gcn: Adaptive multi-channel graph convolutional networks. In Proceedings of the 26th ACM SIGKDD International conference on knowledge discovery & data mining, pages 1243–1253, 2020.

  30. [30]

    Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560, 1990

    Paul J Werbos. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560, 1990.

  31. [31]

    How Powerful are Graph Neural Networks?

    Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2018.

  32. [32]

    Representation learning on graphs with jumping knowledge networks

    Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. Representation learning on graphs with jumping knowledge networks. In International conference on machine learning, pages 5453–5462. PMLR, 2018.

  33. [33]

    Link prediction based on graph neural networks. Advances in neural information processing systems, 31, 2018

    Muhan Zhang and Yixin Chen. Link prediction based on graph neural networks. Advances in neural information processing systems, 31, 2018.

  34. [34]

    An end-to-end deep learning architecture for graph classification

    Muhan Zhang, Zhicheng Cui, Marion Neumann, and Yixin Chen. An end-to-end deep learning architecture for graph classification. In Proceedings of the AAAI conference on artificial intelligence, 2018.

  35. [35]

    Node dependent local smoothing for scalable graph learning. Advances in Neural Information Processing Systems, 34:20321–20332, 2021

    Wentao Zhang, Mingyu Yang, Zeang Sheng, Yang Li, Wen Ouyang, Yangyu Tao, Zhi Yang, and Bin Cui. Node dependent local smoothing for scalable graph learning. Advances in Neural Information Processing Systems, 34:20321–20332, 2021.

  36. [36]

    Pairnorm: Tackling over-smoothing in gnns

    Lingxiao Zhao and Leman Akoglu. Pairnorm: Tackling over-smoothing in gnns. In International Conference on Learning Representations, 2020.

  37. [37]

    Attention-based Fusion Method: To further validate our claim that attention-based layer fusion inherently suffers from weight collapse, especially on heterophilic graphs, we conduct a detailed empirical analysis of attention-weight distributions (averaged over all nodes) across different propagation depths. 6.1. Experimental Setup: We evaluate attention-ba…

  38. [38]

    The details are provided in Table 3

    The Details of Dataset: In total, fourteen datasets are used in this work, with each being fixedly divided. The details are provided in Table 3…

  39. [39]

    Note that for both the ablation and validation experiments, the parameter settings are identical to those used in the semi-supervised node classification experiment

    Parameter Setup: Table 4 introduces the parameter settings for the semi-supervised node classification experiments, including the k parameter of Top-k in LSC, as well as the propagation depths Q1 and Q2 used in the LEDF operator. Note that for both the ablation and validation experiments, the parameter settings are identical to those used in the semi-supervise…

  40. [40]

    Visualization on Semi-supervised Node Classification: Figure 10 illustrates the qualitative comparison on the Cora dataset, showcasing the visual differences among the backbones (MLP and GCN) themselves and the backbones with baselines (LEDF-GNN, BORF, ComFy and AGMixup). Figure 11 illustrates the qualitative comparison on the ACM dataset, showcasing the vi…

  41. [41]

    Bit-operation-based LSC needs lower memory overhead than floating-point cosine similarity

    LSC vs. Cosine Similarity: Table 5 shows that the more intuitive LSC can obtain better homophily than cosine similarity. Bit-operation-based LSC needs lower memory overhead than floating-point cosine similarity. Dataset statistics (Nodes / Edges / Features / Classes / Homophily / Train/Valid/Test): Cora 2708 / 10556 / 1433 / 7 / 0.81 / 140/500/1000; CiteSeer 3327 / 9104 / 3703 / 6 / 0.74 / 120/500/1000; PubMed 19717 / 8864…

  42. [42]

    Accordingly, the runtime and memory overhead reported in Table 6 also show that our model only needs lightweight overhead beyond the backbone

    Complexity Analysis: The time and space complexity of our method beyond the backbone model are O(cm+n) and O(cn), respectively, where n is the node number, m is the edge number, and c is the class number. Accordingly, the runtime and memory overhead reported in Table 6 also show that our model only needs lightweight overhead beyond the backbone.

  43. [43]

    Figure 14 shows the robustness of our method across different settings

    Hyperparameter Analysis: We conduct sensitivity analysis on two key hyperparameters, depth Q and reconstruction hyperparameter K, by using the heterophilic Wisconsin dataset. Figure 14 shows the robustness of our method across different settings.

  44. [44]

    JKNet-Mean focuses on layer fusion, GCNII and NDLS are deep GNNs, H2GCN is the representative heterophily-focused method

    Comparison with Heterophily-focused and Deep-GNN Baselines: The comparison results are reported in Table 7. JKNet-Mean focuses on layer fusion, GCNII and NDLS are deep GNNs, and H2GCN is the representative heterophily-focused method. The Wisconsin and Chameleon datasets are heterophilic.
    Comparison with Heterophliy-focused and Deep-GNN Baseline The comparison results are reported in Table 7. JKNet- Mean focus on layer fusion, GCNII and NDLS are deep- GNNs, H 2GCN is the representative heterophily-focused method. The datasets of Wisconsin and Chameleon are het- erophilic. w/o + LEDF-GNN + BORF + ComFy + AGMixup MLPGCN Figure 10. The visual...