pith. machine review for the scientific record.

arxiv: 2604.09419 · v1 · submitted 2026-04-10 · 💻 cs.LG · cs.DC

Recognition: 1 theorem link

· Lean Theorem

NOMAD: Generating Embeddings for Massive Distributed Graphs

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 18:11 UTC · model grok-4.3

classification 💻 cs.LG cs.DC
keywords graph embeddings · distributed computing · MPI · LINE algorithm · scalability · node2vec · large-scale graphs · network embedding

The pith

NOMAD distributes LINE-based graph embedding across machines with MPI, handling massive graphs at 10-100x the speed of single-node baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that graph embedding methods like LINE can be scaled to graphs with billions of edges by moving the computation into distributed memory using the Message Passing Interface (MPI). Single-node approaches run out of memory or take too long on such graphs, which are common in web and scientific data. NOMAD adds targeted adjustments that cut the amount of data sent between machines during the random-walk sampling process. These changes produce large measured speedups on real hardware, while the resulting node vectors remain close in quality to those from standard single-node and other distributed tools. If correct, this removes a practical barrier to applying structure-preserving embeddings to networks that previously could not be processed at all.
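
To make the pattern concrete, here is a minimal mpi4py sketch of the general idea: each rank trains LINE-style second-order embeddings on its own edge sample and periodically averages the replicated model with an Allreduce. This is a deliberately naive baseline, not NOMAD's method; NOMAD partitions the graph across ranks and uses the paper's trade-offs to communicate far less than a whole-model Allreduce. All sizes and hyperparameters below are illustrative.

```python
# Minimal sketch of distributed LINE-style training with mpi4py.
# Assumes fully replicated embeddings synchronized by periodic Allreduce,
# which is far simpler (and chattier) than NOMAD's partitioned variants.
import numpy as np
from mpi4py import MPI

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_nodes, dim, n_neg, lr = 1000, 64, 5, 0.025
rng = np.random.default_rng(seed=rank)

# Toy edge sample; in a real run each rank would hold a graph partition.
edges = rng.integers(0, n_nodes, size=(10_000, 2))

emb = rng.normal(0.0, 0.1, (n_nodes, dim))  # vertex vectors
ctx = np.zeros((n_nodes, dim))              # context vectors (2nd-order LINE)

for step, (u, v) in enumerate(edges):
    # One positive target plus K negatives, as in LINE's negative sampling.
    targets = np.concatenate(([v], rng.integers(0, n_nodes, n_neg)))
    labels = np.zeros(len(targets))
    labels[0] = 1.0
    g = (sigmoid(ctx[targets] @ emb[u]) - labels)[:, None]
    grad_u = (g * ctx[targets]).sum(axis=0)
    ctx[targets] -= lr * g * emb[u]   # note: duplicate negatives update once
    emb[u] -= lr * grad_u
    # Periodic synchronization: average the replicated model across ranks.
    if step % 1000 == 999:
        for arr in (emb, ctx):
            comm.Allreduce(MPI.IN_PLACE, arr, op=MPI.SUM)
            arr /= size

if rank == 0:
    print("trained", emb.shape[0], "embeddings across", size, "ranks")
```

Run under, e.g., `mpirun -n 4 python sketch.py`; the Allreduce every 1000 updates is exactly the kind of blanket communication the paper's variants are designed to avoid.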

Core claim

NOMAD is a distributed-memory framework that implements the proximity-based embedding models of the LINE algorithm using MPI. It introduces several practical trade-offs that lower communication volume for irregular graphs, yielding median speedups of 10-100x versus multi-threaded LINE and node2vec, 35-76x versus distributed PBG, and 12-370x end-to-end on real-world graphs while producing embeddings of competitive quality.

What carries the argument

The NOMAD framework runs the proximity-based embedding models from LINE under MPI message passing and applies practical trade-offs that reduce inter-node communication for irregular distributed graphs.

Load-bearing premise

The practical adjustments that cut communication volume still capture the same node proximity information as the original models and do not systematically distort the learned vectors on the tested graphs.

What would settle it

Running NOMAD and the original single-node LINE on an identical large graph and finding that downstream task performance using NOMAD embeddings is substantially worse.
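
One concrete version of that test, sketched below with scikit-learn: fit the same off-the-shelf classifier on each system's embeddings for the same graph and compare held-out accuracy. The file names and the label file are hypothetical stand-ins; the paper's actual tasks and datasets would take their place.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical inputs: per-node class labels plus embedding matrices
# exported by each system for the same graph and node ordering.
labels = np.load("node_labels.npy")

for path in ("line_embeddings.npy", "nomad_embeddings.npy"):
    X = np.load(path)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.5, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(path, "node-classification accuracy:", clf.score(X_te, y_te))
```

A substantial accuracy gap against NOMAD would refute the fidelity claim; comparable scores would support it.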

Figures

Figures reproduced from arXiv: 2604.09419 by Aishwarya Sarkar, Ali Jannesari, Nathan R. Tallent, Sayan Ghosh.

Figure 1
Figure 1. With sync: processes observe up-to-date context. Without sync: delayed updates result in stale context, affecting quality.
Figure 2
Figure 2. Comparison of NOMAD with LINE and node2vec. Panels: average execution time versus node count on Youtube and Physics.
Figure 5
Figure 5. End-to-end time of NOMAD variants on ogbn-arxiv.
Figure 6
Figure 6. Scalability of NOMAD on 13 diverse datasets (Table II) up to 16K processes. Top: AllReduce; Bottom: Ibarrier.
Figure 7
Figure 7. Distribution of positive samples across processes for AR-Local and AR-Remote (Reuse-Spill) on large datasets. Red markers denote IB-Local and IB-Remote (Reuse-Spill), which show little or no variation across processes.
Figure 8
Figure 8. Breakdown of end-to-end execution time.
Figure 9
Figure 9. 2 MB huge pages on IB-Remote (Aug-Pair). Top: sampling/training breakdown. Bottom: end-to-end time.
Figure 10
Figure 10. RMA variants increase load imbalance for Refresh-Spill (depicted by error bars): comparing regular against RMA for Refresh-Spill and Aug-Pair.
read the original abstract

Successful machine learning on graphs or networks requires embeddings that not only represent nodes and edges as low-dimensional vectors but also preserve the graph structure. Established methods for generating embeddings require flexible exploration of the entire graph through repeated use of random walks that capture graph structure with samples of nodes and edges. These methods create scalability challenges for massive graphs with millions-to-billions of edges because single-node solutions have inadequate memory and processing capabilities. We present NOMAD, a distributed-memory graph embedding framework using the Message Passing Interface (MPI) for distributed graphs. NOMAD implements proximity-based models proposed in the widely popular LINE (Large-scale Information Network Embedding) algorithm. We propose several practical trade-offs to improve the scalability and communication overheads confronted by irregular and distributed graph embedding methods, catering to massive-scale graphs arising in web and science domains. NOMAD demonstrates median speedups of 10/100x on CPU-based NERSC Perlmutter cluster relative to the popular reference implementations of multi-threaded LINE and node2vec, 35-76x over distributed PBG, and competitive embedding quality relative to LINE, node2vec, and GraphVite, while yielding 12-370x end-to-end speedups on real-world graphs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript presents NOMAD, an MPI-based distributed-memory framework for node embeddings on massive graphs (millions to billions of edges). It implements proximity-based models from the LINE algorithm, introduces practical trade-offs to reduce communication overhead for irregular distributed graphs, and reports median speedups of 10/100x over multi-threaded LINE and node2vec, 35-76x over distributed PBG, competitive embedding quality versus LINE/node2vec/GraphVite, and 12-370x end-to-end speedups on real-world graphs evaluated on the NERSC Perlmutter cluster.

Significance. If the empirical results hold, the work addresses a key scalability barrier for graph ML on web- and science-scale networks that exceed single-node memory/processing limits. Strengths include direct performance comparisons against standard reference implementations, explicit validation that the communication-reducing trade-offs preserve embedding fidelity on the evaluated graphs, and quantified end-to-end gains that demonstrate practical impact.

minor comments (3)
  1. The abstract states 'competitive embedding quality' and 'preserve the graph structure' but does not name the concrete metrics (e.g., AUC for link prediction, accuracy for node classification) or the exact graphs used for the 12-370x claim; these should be stated explicitly in the abstract and §4.
  2. The description of the practical trade-offs for communication reduction is central to the scalability argument; adding a short pseudocode listing or parameter table in the methods section would improve reproducibility without lengthening the paper (a hypothetical sketch of one such trade-off follows this list).
  3. Speedup figures are given as medians; reporting the number of runs, standard deviation or error bars, and the precise graph partitioning strategy used on Perlmutter would allow readers to assess variability of the 10-100x and 35-76x claims.
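
To give comment 2 a concrete shape: the figure captions name "reuse" and "refresh" variants, which suggests trade-offs of the cache-and-bounded-staleness kind. The sketch below illustrates that flavor of mechanism only; it is hypothetical, not NOMAD's implementation, and `fetch_fn` and `refresh_every` are invented names.

```python
# Hypothetical sketch of a "reuse"-style trade-off: cache remote nodes'
# vectors locally and refresh them only every `refresh_every` steps,
# trading bounded staleness for far fewer messages. Not NOMAD's code.
class RemoteEmbeddingCache:
    def __init__(self, fetch_fn, refresh_every=100):
        self.fetch_fn = fetch_fn        # fetch_fn(node) -> vector, over the wire
        self.refresh_every = refresh_every
        self.cache = {}                 # node -> last fetched vector
        self.fetched_at = {}            # node -> step of last fetch

    def get(self, node, step):
        stale = step - self.fetched_at.get(node, -(10**9)) >= self.refresh_every
        if node not in self.cache or stale:
            self.cache[node] = self.fetch_fn(node)  # one message, then reuse
            self.fetched_at[node] = step
        return self.cache[node]

# Dummy usage: a lambda stands in for an actual remote fetch.
cache = RemoteEmbeddingCache(fetch_fn=lambda node: [0.0] * 8)
vec = cache.get(42, step=0)     # fetches over the "network"
vec = cache.get(42, step=50)    # served from cache, no communication
```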

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and recommendation for minor revision. The assessment accurately summarizes NOMAD's contributions to scalable distributed graph embeddings, including the reported speedups, communication optimizations, and competitive embedding quality on large real-world graphs. We appreciate the recognition of its practical impact for web- and science-scale networks.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents an engineering framework (NOMAD) that implements the existing LINE proximity objective on distributed graphs using MPI, along with practical communication-reducing trade-offs. All central claims are empirical measurements of runtime speedups and embedding quality against external baselines (multi-threaded LINE, node2vec, PBG, GraphVite) on real-world graphs. No derivation chain, first-principles prediction, or fitted parameter is presented that reduces to its own inputs by construction. Self-citations, if any, are limited to referencing the original LINE paper and are not load-bearing for the reported results. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated assumption that the chosen trade-offs maintain embedding fidelity.

pith-pipeline@v0.9.0 · 5524 in / 1139 out tokens · 101682 ms · 2026-05-10T18:11:49.452072+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

50 extracted references · 30 canonical work pages · 2 internal anchors

  1. [1]

    Pytorch-biggraph: A large scale graph embedding system,

    A. Lerer, L. Wu, J. Shen, T. Lacroix, L. Wehrstedt, A. Bose, and A. Peysakhovich, “Pytorch-biggraph: A large scale graph embedding system,” Proceedings of Machine Learning and Systems, vol. 1, pp. 120–131, 2019. [Online]. Available: https://proceedings.mlsys.org/paper_files/paper/2019/hash/1eb34d662b67a14e3511d0dfd78669be-Abstract.html

  2. [2]

    Efficient graph computation for node2vec,

    D. Zhou, S. Niu, and S. Chen, “Efficient graph computation for node2vec,” CoRR, vol. abs/1805.00280, 2018. [Online]. Available: http://arxiv.org/abs/1805.00280

  3. [3]

    Graphvite: A high-performance cpu-gpu hybrid system for node embedding,

    Z. Zhu, S. Xu, J. Tang, and M. Qu, “Graphvite: A high-performance cpu-gpu hybrid system for node embedding,” in The World Wide Web Conference, 2019, pp. 2494–2504. [Online]. Available: https://doi.org/10.1145/3308558.3313508

  4. [4]

    Grape for fast and scalable graph processing and random-walk-based embedding,

    L. Cappelletti, T. Fontana, E. Casiraghi, V. Ravanmehr, T. J. Callahan, C. Cano, M. P. Joachimiak, C. J. Mungall, P. N. Robinson, J. Reese et al., “Grape for fast and scalable graph processing and random-walk-based embedding,” Nature Computational Science, vol. 3, no. 6, pp. 552–568, 2023. [Online]. Available: https://doi.org/10.1038/s43588-023-00465-8

  6. [6]

    cugraph: Rapids graph analytics library,

    RAPIDS AI, “cugraph: Rapids graph analytics library,” https://github.com/rapidsai/cugraph, 2026, accessed: 2026-04-01

  7. [7]

    Accelerating graph sampling for graph machine learning using gpus,

    A. Jangda, S. Polisetty, A. Guha, and M. Serafini, “Accelerating graph sampling for graph machine learning using gpus,” in Proceedings of the Sixteenth European Conference on Computer Systems, 2021, pp. 311–326. [Online]. Available: https://doi.org/10.1145/3447786.3456244

  8. [8]

    C-saw: A framework for graph sampling and random walk on gpus,

    S. Pandey, L. Li, A. Hoisie, X. S. Li, and H. Liu, “C-saw: A framework for graph sampling and random walk on gpus,” in SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2020, pp. 1–15. [Online]. Available: https://doi.org/10.1109/SC41405.2020.00060

  9. [9]

    gsampler: General and efficient gpu-based graph sampling for graph learning,

    P. Gong, R. Liu, Z. Mao, Z. Cai, X. Yan, C. Li, M. Wang, and Z. Li, “gsampler: General and efficient gpu-based graph sampling for graph learning,” in Proceedings of the 29th Symposium on Operating Systems Principles, 2023, pp. 562–578. [Online]. Available: https://doi.org/10.1145/3600006.3613168

  10. [10]

    Representation learning on graphs: Methods and applications,

    W. L. Hamilton, R. Ying, and J. Leskovec, “Representation learning on graphs: Methods and applications,” IEEE Data Eng. Bull., vol. 40, pp. 52–74, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:3215337

  11. [11]

    Exploiting centrality information with graph convolutions for network representation learning,

    H. Chen, H. Yin, T. Chen, Q. V. H. Nguyen, W.-C. Peng, and X. Li, “Exploiting centrality information with graph convolutions for network representation learning,” in 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 2019, pp. 590–601. [Online]. Available: https://doi.org/10.1109/ICDE.2019.00059

  12. [12]

    Line: Large-scale information network embedding,

    J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “Line: Large-scale information network embedding,” in Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 1067–1077. [Online]. Available: https://doi.org/10.1145/2736277.2741093

  13. [13]

    Deepwalk: Online learning of social representations,

    B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 701–710. [Online]. Available: https://doi.org/10.1145/2623330.2623732

  14. [14]

    node2vec: Scalable Feature Learning for Networks

    A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864. [Online]. Available: https://doi.org/10.1145/2939672.2939754

  15. [15]

    Unsupervised graph representation learning with inductive shallow node embedding,

    R. Kiss and G. Szűcs, “Unsupervised graph representation learning with inductive shallow node embedding,” Complex & Intelligent Systems, vol. 10, no. 5, pp. 7333–7348, 2024. [Online]. Available: https://doi.org/10.1007/s40747-024-01545-6

  16. [16]

    Heterogeneous graph neural network for attribute completion,

    K. Wang, Y. Yu, C. Huang, Z. Zhao, and J. Dong, “Heterogeneous graph neural network for attribute completion,” Knowledge-Based Systems, vol. 251, p. 109171, 2022. [Online]. Available: https://doi.org/10.1016/j.knosys.2022.109171

  17. [17]

    Heterogeneous graph neural networks with weak information,

    C. Na, K. Yang, and X. Sun, “Heterogeneous graph neural networks with weak information,” Expert Systems with Applications, p. 130133, 2025. [Online]. Available: https://doi.org/10.1016/j.eswa.2025.130133

  19. [19]

    Graph convolutional networks for graphs containing missing features,

    H. Taguchi, X. Liu, and T. Murata, “Graph convolutional networks for graphs containing missing features,” Future Generation Computer Systems, vol. 117, pp. 155–168, 2021. [Online]. Available: https://doi.org/10.1016/j.future.2020.11.016

  20. [20]

    Protein functional properties prediction in sparsely-label ppi networks through regularized non-negative matrix factorization,

    Q. Wu, Z. Wang, C. Li, Y. Ye, Y. Li, and N. Sun, “Protein functional properties prediction in sparsely-label ppi networks through regularized non-negative matrix factorization,” BMC Systems Biology, vol. 9, no. Suppl 1, p. S9, 2015. [Online]. Available: https://doi.org/10.1186/1752-0509-9-S1-S9

  21. [21]

    A global geometric framework for nonlinear dimensionality reduction,

    J. B. Tenenbaum, V. de Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction,” Science, vol. 290, no. 5500, pp. 2319–2323, 2000. [Online]. Available: https://doi.org/10.1126/science.290.5500.2319

  22. [22]

    Laplacian eigenmaps for dimensionality reduction and data representation,

    M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Comput., vol. 15, no. 6, pp. 1373–1396, 2003. [Online]. Available: https://doi.org/10.1162/089976603321780317

  23. [23]

    Multidimensional scaling,

    M. A. Cox and T. F. Cox, “Multidimensional scaling,” in Handbook of Data Visualization. Springer, 2008, pp. 315–347. [Online]. Available: https://doi.org/10.1007/978-3-540-33037-0_14

  24. [24]

    Hierarchical stochastic neighbor embedding,

    N. Pezzotti, T. Höllt, B. Lelieveldt, E. Eisemann, and A. Vilanova, “Hierarchical stochastic neighbor embedding,” in Computer Graphics Forum, vol. 35, no. 3. Wiley Online Library, 2016, pp. 21–30. [Online]. Available: https://doi.org/10.1111/cgf.12878

  25. [25]

    Comparing random walks in graph embedding and link prediction,

    A. Vital Jr, F. N. Silva, and D. R. Amancio, “Comparing random walks in graph embedding and link prediction,” PLoS One, vol. 19, no. 11, p. e0312863, 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2308.03636

  26. [26]

    Graph embedding techniques, applications, and performance: A survey,

    P. Goyal and E. Ferrara, “Graph embedding techniques, applications, and performance: A survey,” Knowledge-Based Systems, vol. 151, pp. 78–94, 2018. [Online]. Available: https://doi.org/10.1016/j.knosys.2018.03.022

  28. [28]

    Scalerunner: A fast mpi-based random walk engine for multi-cpu systems,

    F. Willich and H. Meyerhenke, “Scalerunner: A fast mpi-based random walk engine for multi-cpu systems,” in European Conference on Parallel Processing. Springer, 2025, pp. 124–138. [Online]. Available: https://doi.org/10.1007/978-3-031-99872-0_9

  29. [29]

    Finding good approximate vertex and edge partitions is np-hard,

    T. N. Bui and C. Jones, “Finding good approximate vertex and edge partitions is np-hard,” Information Processing Letters, vol. 42, no. 3, pp. 153–159, 1992. [Online]. Available: https://doi.org/10.1016/0020-0190(92)90140-Q

  30. [30]

    Rethinking design paradigm of graph processing system with a cxl-like memory semantic fabric,

    X. Zhang, Y. Chang, T. Lu, K. Zhang, and M. Chen, “Rethinking design paradigm of graph processing system with a cxl-like memory semantic fabric,” in 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2023, pp. 25–35

  31. [31]

    Distributed representations of words and phrases and their compositionality,

    T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” Advances in Neural Information Processing Systems (NeurIPS), 2013. [Online]. Available: https://proceedings.neurips.cc/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-Abstract.html

  32. [32]

    Efficient Estimation of Word Representations in Vector Space

    T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” in 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings, Y. Bengio and Y. LeCun, Eds., 2013. [Online]. Available: http://arxiv.org/abs/1301.3781

  33. [33]

    Pecanpy: a fast, efficient and parallelized python implementation of node2vec,

    R. Liu and A. Krishnan, “Pecanpy: a fast, efficient and parallelized python implementation of node2vec,” Bioinformatics, vol. 37, no. 19, pp. 3377–3379, 2021. [Online]. Available: https://doi.org/10.1093/bioinformatics/btab202

  34. [34]

    Hogwild: A lock-free approach to parallelizing stochastic gradient descent,

    B. Recht, C. Re, S. Wright, and F. Niu, “Hogwild: A lock-free approach to parallelizing stochastic gradient descent,” Advances in Neural Information Processing Systems, vol. 24, 2011. [Online]. Available: https://proceedings.neurips.cc/paper/2011/hash/218a0aefd1d1a4be65601cc6ddc1520e-Abstract.html

  35. [35]

    Nups: A parameter server for machine learning with non-uniform parameter access,

    A. Renz-Wieland, R. Gemulla, Z. Kaoudi, and V. Markl, “Nups: A parameter server for machine learning with non-uniform parameter access,” in Proceedings of the 2022 International Conference on Management of Data, 2022, pp. 481–495. [Online]. Available: https://doi.org/10.1145/3514221.3517860

  36. [36]

    Pregel: a system for large-scale graph processing,

    G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: a system for large-scale graph processing,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, 2010, pp. 135–146. [Online]. Available: https://doi.org/10.1145/1807167.1807184

  37. [37]

    Scalable node embedding algorithms using distributed sparse matrix operations,

    I. Ranawaka and A. Azad, “Scalable node embedding algorithms using distributed sparse matrix operations,” in 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2024, pp. 1199–1201. [Online]. Available: https://doi.org/10.1109/IPDPSW63119.2024.00205

  38. [38]

    Distributed graph embedding with information-oriented random walks,

    P. Fang, A. Khan, S. Luo, F. Wang, D. Feng, Z. Li, W. Yin, and Y. Cao, “Distributed graph embedding with information-oriented random walks,” Proc. VLDB Endow., vol. 16, no. 7, pp. 1643–1656, 2023. [Online]. Available: https://doi.org/10.14778/3587136.3587140

  40. [40]

    Sparse collective operations for mpi,

    T. Hoefler and J. L. Traff, “Sparse collective operations for mpi,” in 2009 IEEE International Symposium on Parallel & Distributed Processing. IEEE, 2009, pp. 1–8. [Online]. Available: https://doi.org/10.1109/IPDPS.2009.5160935

  41. [41]

    Accelerate science on perlmutter with nersc,

    C. Yang and J. Deslippe, “Accelerate science on perlmutter with nersc,” Bulletin of the American Physical Society, vol. 65, 2020. [Online]. Available: https://meetings.aps.org/Meeting/MAR20/Session/D40.9

  42. [42]

    LINE: Large-scale information network embedding (github repository),

    J. Tang, “LINE: Large-scale information network embedding (github repository),” https://github.com/tangjianpku/LINE, 2026, accessed: 2026-02-06

  43. [43]

    SNAP: Stanford network analysis platform (github repository),

    Stanford Network Analysis Project, “SNAP: Stanford network analysis platform (github repository),” https://github.com/snap-stanford/snap, accessed: 2026-02-06

  44. [44]

    Pubmedgraphdataset,

    DGL Contributors, “Pubmedgraphdataset,” https://www.dgl.ai/dgl_docs/generated/dgl.data.PubmedGraphDataset.html, accessed: 2026-04-04

  45. [45]

    Gnn benchmark datasets,

    Shchur, Oleksandr and others, “Gnn benchmark datasets,” https://github.com/shchur/gnn-benchmark#datasets, accessed: 2026-04-04

  46. [46]

    Open graph benchmark: Datasets for machine learning on graphs,

    W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu, M. Catasta, and J. Leskovec, “Open graph benchmark: Datasets for machine learning on graphs,” Advances in Neural Information Processing Systems, vol. 33, pp. 22118–22133, 2020. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/fb60d411a5c5b72b2e7d3527cfc84fd0-Abstract.html

  47. [47]

    Mpi-sws social network datasets,

    Max Planck Institute for Software Systems, “Mpi-sws social network datasets,” https://socialnetworks.mpi-sws.org/, accessed: 2026-04-04

  48. [48]

    SNAP Datasets: Stanford large network dataset collection,

    J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap.stanford.edu/data, Jun. 2014

  49. [49]

    Web data commons hyperlink graph, 2012-08,

    Web Data Commons, “Web data commons hyperlink graph, 2012-08,” https://webdatacommons.org/hyperlinkgraph/2012-08/download.html, accessed: 2026-04-04

  50. [50]

    An implementation and evaluation of the mpi 3.0 one-sided communication interface,

    J. Dinan, P. Balaji, D. Buntinas, D. Goodell, W. Gropp, and R. Thakur, “An implementation and evaluation of the mpi 3.0 one-sided communication interface,” Concurrency and Computation: Practice and Experience, vol. 28, no. 17, pp. 4385–4404, 2016