A Survey and Experimental Analysis of Distributed Subgraph Matching

Jingren Zhou; Kongzhang Hao; Longbin Lai; Lu Qin; Ran Wang; Wenjie Zhang; Xin Jin; Xuemin Lin; Ying Zhang; Zhengmin Lai

arxiv: 1906.11518 · v1 · pith:R5BO3F4Enew · submitted 2019-06-27 · 💻 cs.DB

A Survey and Experimental Analysis of Distributed Subgraph Matching

Longbin Lai , Zhu Qing , Zhengyi Yang , Xin Jin , Zhengmin Lai , Ran Wang , Kongzhang Hao , Xuemin Lin

show 5 more authors

Lu Qin Wenjie Zhang Ying Zhang Zhengping Qian Jingren Zhou

This is my paper

Pith reviewed 2026-05-25 14:07 UTC · model grok-4.3

classification 💻 cs.DB

keywords distributed subgraph matchingstrategy comparisonTimely dataflowexperimental analysisunlabelled matchinglabelled matchingpractical guidegraph query processing

0 comments

The pith

Reimplementing four distributed subgraph matching strategies inside one dataflow system isolates their performance differences and produces a practical selection guide.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extracts four core strategies and three shared optimizations from existing distributed subgraph matching algorithms. It then codes every strategy together with the optimizations on the single Timely dataflow platform so that differences in results can be attributed to strategy rather than platform. Large-scale experiments on both unlabelled and labelled graphs test the variants across many graph sizes, query patterns, and hardware settings. The measured trade-offs are condensed into concrete advice on which strategy to use for a given workload. A sympathetic reader cares because subgraph matching is a fundamental operation in large-scale graph analytics, yet earlier comparisons had mixed strategy effects with unrelated implementation choices.

Core claim

The authors identify four strategies and three general-purpose optimizations from representative state-of-the-art works. They implement the four strategies with the optimizations based on the common Timely dataflow system for systematic strategy-level comparison. Their implementation covers all representation algorithms. They conduct extensive experiments for both unlabelled matching and labelled matching to analyze the performance of distributed subgraph matching under various settings, which is finally summarized as a practical guide.

What carries the argument

Classification of algorithms into four strategies plus three optimizations, all realized inside the same Timely dataflow engine so that strategy effects can be measured in isolation.

If this is right

Performance rankings among the four strategies become directly comparable because platform differences are removed.
The practical guide indicates which strategy wins for particular combinations of graph scale, query shape, and label presence.
The three optimizations improve every strategy, yet their relative value changes with the underlying strategy.
Results apply separately to unlabelled and labelled matching workloads.
The experimental methodology covers enough settings to support workload-specific recommendations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same isolation technique could be applied to other distributed graph primitives such as triangle listing or reachability queries.
If a new strategy appears, it can be inserted into the Timely framework and ranked against the existing four without re-running all prior experiments.
Communication volume and load balance patterns exposed by the common runtime may explain why certain strategies scale better on particular graphs.
The guide could be validated by deploying its recommended strategy on a production workload and measuring end-to-end query latency.

Load-bearing premise

Re-implementing the original algorithms inside Timely dataflow faithfully reproduces their performance characteristics without introducing new biases from the shared runtime.

What would settle it

If the original published implementations of the four strategies, run on identical hardware and inputs, produce different relative performance orderings than the Timely reimplementations, the claim of faithful strategy-level comparison would be falsified.

Figures

Figures reproduced from arXiv: 1906.11518 by Jingren Zhou, Kongzhang Hao, Longbin Lai, Lu Qin, Ran Wang, Wenjie Zhang, Xin Jin, Xuemin Lin, Ying Zhang, Zhengmin Lai, Zhengping Qian, Zhengyi Yang, Zhu Qing.

**Figure 1.** Figure 1: Query Graph Q (Left) and Data Graph G (Right). By treating the query vertices as attributes and data edges as relational table, we can write subgraph matching query as a multiway-way join of the edge relations. For example, regardless of label and order constraints, the query of Example 2.1 can be written as the following join R(Q) = E(v1, v2) on E(v2, v3) on E(v3, v4) on E(v1, v4) on E(v2, v4). (1) This m… view at source ↗

**Figure 2.** Figure 2: Simple BINJOIN Join Plans. To improve the performance of BINJOIN, people devoted their efforts into: (1) using more complex base relations other than edge; (2) devising better join plan P. The base relations B[q] represent the matches of a set of sub-structures [q] of the query graph Q. Each p ∈ [q] is called a join unit, and it must satisfy VQ = S p∈[q] Vp and EQ = S p∈[q] Ep. With the data graph partitio… view at source ↗

**Figure 3.** Figure 3: The Core-Crystal Decomposition of the query graph. With core-crystal decomposition, the computation has accordingly split into three stages: 1. Core computation. Given that Q(V c Q) itself is a query graph, the algorithm can be recursively applied to compute Q(V c Q) according to [45]. 2. Crystal computation. A special case of crystal is Q(x, 1), which is indeed a (x + 1)-clique. Suppose an instance of the… view at source ↗

**Figure 4.** Figure 4: The unlabelled queries. q2 q5 10 0 10 1 10 2 10 3 10 4 >3h T (sec) (a) BINJOIN on US q2 q5 10 0 10 1 10 2 10 3 10 4 >3h T (sec) (b) BINJOIN on LJ q2 q5 10 0 10 1 10 2 10 3 10 4 >3h T (sec) (c) WOPTJOIN on US q2 q5 10 0 10 1 10 2 10 3 10 4 >3h T (sec) (d) WOPTJOIN on LJ [PITH_FULL_IMAGE:figures/full_fig_p015_4.png] view at source ↗

**Figure 5.** Figure 5: Effectiveness of optimizations. on US. The reason is that US is a sparse dataset with few room for Compression, while Compression itself incurs extra cost. We also compare BINJOIN with BINJOIN(w.o.c.) on the other sparse graph EU, and the results are the same. For WOPTJOIN strategy, Batching has little impact to the performance. Surprisingly, after using TrIndexing to WOPTJOIN, the improvement by average i… view at source ↗

**Figure 6.** Figure 6: Challenging queries. FULLREP beats all the other strategies, while SHRCUBE fails q8 and q9 on GO because of OT. Although SHRCUBE uses the same local algorithm as FULLREP, it spends a lot of time on deduplication (Section 3.3). We focus on comparing BINJOIN and WOPTJOIN on GO dataset. On the one hand, WOPTJOIN outperforms BINJOIN for q7 and q8. Their join plans of q7 are nearly the same except that BINJOIN … view at source ↗

**Figure 7.** Figure 7: All-around comparisons. FULLREP typically outperforms the other strategies. Observe that WOPTJOIN’s performance is often very close to FULLREP. The reason is that the WOPTJOIN’s computing plans for these evaluated queries are similar to “DualSim” adopted by FULLREP. The extra communication cost of WOPTJOIN has been reduced to very low while adopting TrIndexing optimization. While comparing WOPTJOIN with BI… view at source ↗

**Figure 8.** Figure 8: Labelled queries. Queries. The queries, shown in [PITH_FULL_IMAGE:figures/full_fig_p018_8.png] view at source ↗

**Figure 9.** Figure 9: All-around comparisons of labelled matching. [PITH_FULL_IMAGE:figures/full_fig_p019_9.png] view at source ↗

**Figure 10.** Figure 10: A practical guide of distributed subgraph matching. The unlabelled case is also known as subgraph listing/enumeration, and due to the gigantic (intermediate) results, people have been either seeking scalable algorithms in parallel, or devising techniques to compress the results. Other than the algorithms studied in this paper (Section 3), Kim et al. proposed the external-memory-based parallel algorithm DU… view at source ↗

**Figure 11.** Figure 11: Scalability experiment: querying q1 and q2 on LJ. All strategies demonstrate reasonable scaling regarding both queries. In terms of COST, note that FULLREP is slightly larger than 1, because “DualSim” is implemented in general for arbitrary query, while “SingleThread” uses a hand24 [PITH_FULL_IMAGE:figures/full_fig_p024_11.png] view at source ↗

**Figure 12.** Figure 12: Varying densities of labelled graph. Exp-8 Vary Labels for Labelled Matching. We generate the datasets with number of labels 0, 5, 10, 15 and 20 based on DG10. Note that there are 5 labels in labelled queries q4 and q7, which are called the target labels. The 10-label dataset is the original DG10. For the one with 5 labels, we will replace each label not in the target labels as one random target label. Fo… view at source ↗

**Figure 13.** Figure 13: Varying labels of labelled graph [PITH_FULL_IMAGE:figures/full_fig_p026_13.png] view at source ↗

read the original abstract

Recently there emerge many distributed algorithms that aim at solving subgraph matching at scale. Existing algorithm-level comparisons failed to provide a systematic view to the pros and cons of each algorithm mainly due to the intertwining of strategy and optimization. In this paper, we identify four strategies and three general-purpose optimizations from representative state-of-the-art works. We implement the four strategies with the optimizations based on the common Timely dataflow system for systematic strategy-level comparison. Our implementation covers all representation algorithms. We conduct extensive experiments for both unlabelled matching and labelled matching to analyze the performance of distributed subgraph matching under various settings, which is finally summarized as a practical guide.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a survey that reimplements four existing distributed subgraph matching strategies inside Timely for controlled comparison and ends with a practical guide, but the comparisons stand or fall on whether the reimplementations preserve the originals' performance traits.

read the letter

The paper pulls four strategies and three optimizations out of prior distributed subgraph matching work, ports the strategies to a shared Timely dataflow runtime, runs head-to-head experiments on unlabeled and labeled cases, and turns the results into a guide for choosing among them. The controlled platform is the main practical contribution because it removes the usual confounding of algorithm, optimization, and runtime that earlier comparisons mixed together. The experiments appear broad enough to cover different graph sizes and query patterns, which gives the guide some grounding in data rather than just author opinion. The reimplementation step is the soft spot. The paper treats the Timely versions as faithful stand-ins for the originals, but without checks against the source systems or published performance numbers it is possible that observed differences trace to Timely-specific choices in partitioning or messaging instead of the strategies themselves. The claim that the implementation covers all representation algorithms does not close that gap. This work is for people inside graph databases and large-scale analytics who need evidence on which existing strategy to pick under different conditions. Readers hunting for new algorithms or theory will not find them here. The experimental design is systematic enough to merit referee time even though the guide's reliability depends on the fidelity question.

Referee Report

2 major / 1 minor

Summary. The manuscript identifies four strategies and three general-purpose optimizations from representative state-of-the-art works on distributed subgraph matching. It re-implements the four strategies (with the optimizations) inside the common Timely dataflow system to enable systematic strategy-level comparison, states that the implementation covers all representation algorithms, conducts extensive experiments on both unlabelled and labelled matching under various settings, and distills the results into a practical guide.

Significance. If the re-implementations faithfully preserve the performance-relevant behavior of the original algorithms, the work supplies a valuable platform-independent comparison that existing algorithm-level studies have not provided. The shared Timely runtime is a methodological strength for isolating strategy differences. The empirical scope (unlabelled and labelled cases, multiple settings) and the resulting practical guide would be useful to the community.

major comments (2)

[§4 (Implementation)] §4 (Implementation): the manuscript asserts that the Timely re-implementations cover all representation algorithms but supplies no external validation (e.g., runtime or scalability numbers on shared benchmarks compared against the original published results). This gap is load-bearing because the entire strategy comparison and the derived practical guide rest on the assumption that observed differences reflect the strategies rather than platform-specific artifacts.
[§5 (Experiments)] §5 (Experiments): without fidelity checks on the re-implementations, it is impossible to determine whether performance differences under varying settings (graph size, label density, etc.) arise from the four strategies or from unstated choices in the common Timely runtime such as partitioning, batching, or indexing.

minor comments (1)

[Abstract] The abstract would benefit from naming the four strategies and three optimizations explicitly rather than referring to them only by count.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of the paper's contributions and for the constructive major comments. We address each point below regarding the validation of the Timely re-implementations.

read point-by-point responses

Referee: [§4 (Implementation)] the manuscript asserts that the Timely re-implementations cover all representation algorithms but supplies no external validation (e.g., runtime or scalability numbers on shared benchmarks compared against the original published results). This gap is load-bearing because the entire strategy comparison and the derived practical guide rest on the assumption that observed differences reflect the strategies rather than platform-specific artifacts.

Authors: We agree that external validation against original published results would strengthen confidence in the re-implementations. However, the majority of the source papers do not provide open-source implementations, which precludes direct runtime or scalability comparisons on identical benchmarks and hardware. Our re-implementations are derived strictly from the algorithmic descriptions and pseudocode in the original works, preserving the four core strategies while standardizing the runtime. In the revision we will expand §4 with an explicit discussion of implementation fidelity, including any simplifications made and why they do not alter the strategy-level behavior being compared. revision: partial
Referee: [§5 (Experiments)] without fidelity checks on the re-implementations, it is impossible to determine whether performance differences under varying settings (graph size, label density, etc.) arise from the four strategies or from unstated choices in the common Timely runtime such as partitioning, batching, or indexing.

Authors: All four strategy implementations share the identical Timely dataflow configuration, including the same partitioning scheme, batch sizes, and indexing structures; only the query-processing logic differs according to the extracted strategy. We will revise §4 and §5 to make this shared configuration explicit and to state that any observed performance variation is therefore attributable to the strategies rather than to runtime parameters. We will also add a short paragraph confirming that the same Timely version and configuration flags were used throughout the experimental campaign. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical survey with re-implementations on shared runtime

full rationale

The paper is a survey and experimental analysis that extracts four strategies and three optimizations from prior literature, re-implements them inside a common Timely dataflow system, runs head-to-head experiments on labelled and unlabelled matching, and condenses results into a practical guide. No equations, derivations, or first-principles claims appear. No self-definitional loops, fitted parameters renamed as predictions, load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation are present. The comparison rests on external experimental outcomes rather than reducing to its own inputs by construction. The methodological assumption that re-implementations preserve original performance characteristics is a standard empirical concern but does not instantiate any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is a survey and experimental comparison; it introduces no new mathematical derivations, fitted parameters, axioms, or postulated entities.

pith-pipeline@v0.9.0 · 5663 in / 1071 out tokens · 23205 ms · 2026-05-25T14:07:59.831616+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages

[1]

http://www.dis.uniroma1.it/challenge9

The challenge9 datasets. http://www.dis.uniroma1.it/challenge9

work page
[2]

https://lemurproject.org/clueweb12

The clubweb12 dataset. https://lemurproject.org/clueweb12

work page
[3]

https://en.wikipedia.org/wiki/Sparse_matrix

Compressed sparse row. https://en.wikipedia.org/wiki/Sparse_matrix

work page
[4]

http://giraph.apache.org/

Giraph. http://giraph.apache.org/

work page
[5]

https://github.com/frankmcsherry/dataflow-join/

The implementation of bigjoin. https://github.com/frankmcsherry/dataflow-join/

work page
[6]

http://ldbcouncil.org/benchmarks

Ldbc benchmarks. http://ldbcouncil.org/benchmarks

work page
[7]

https://ldbc.github.io/ldbc_snb_docs/ ldbc-snb-specification.pdf

The ldbc social network benchmark. https://ldbc.github.io/ldbc_snb_docs/ ldbc-snb-specification.pdf

work page
[8]

https://github.com/frankmcsherry/timely-dataflow

The open-sourced timely dataﬂow system. https://github.com/frankmcsherry/timely-dataflow

work page
[9]

http://snap.stanford.edu/data/index.html

The snap datasets. http://snap.stanford.edu/data/index.html

work page
[10]

http://law.di.unimi.it/datasets.php

The webgraph datasets. http://law.di.unimi.it/datasets.php

work page
[11]

C. R. Aberger, S. Tu, K. Olukotun, and C. Ré. Emptyheaded: A relational engine for graph processing. SIGMOD ’16, pages 431–446

work page
[12]

F. N. Afrati, D. Fotakis, and J. D. Ullman. Enumerating subgraph instances using map-reduce. In Proc. of ICDE’13, 2013

work page 2013
[13]

Ammar, F

K. Ammar, F. McSherry, S. Salihoglu, and M. Joglekar. Distributed evaluation of subgraph queries using worst- case optimal low-memory dataﬂows. PVLDB, 11(6):691–704, 2018

work page 2018
[14]

Angles, M

R. Angles, M. Arenas, P. Barceló, A. Hogan, J. Reutter, and D. Vrgo ˇc. Foundations of modern query languages for graph databases. ACM Comput. Surv., 50(5), 2017

work page 2017
[15]

Beame, P

P. Beame, P. Koutris, and D. Suciu. Communication steps for parallel query processing. J. ACM, 64(6):40:1– 40:58, 2017

work page 2017
[16]

F. Bi, L. Chang, X. Lin, L. Qin, and W. Zhang. Efﬁcient subgraph matching by postponing cartesian products. SIGMOD ’16, pages 1199–1214, 2016

work page 2016
[17]

Carbone, A

P. Carbone, A. Katsifodimos, Kth, S. Sweden, S. Ewen, V . Markl, S. Haridi, and K. Tzoumas. Apache ﬂink TM: Stream and batch processing in a single engine. 38, 01 2015

work page 2015
[18]

Chakrabarti, Y

D. Chakrabarti, Y . Zhan, and C. Faloutsos. R-mat: A recursive model for graph mining. In SDM, 2004

work page 2004
[19]

Choudhury, L

S. Choudhury, L. B. Holder, G. Chin, K. Agarwal, and J. Feo. A selectivity based approach to continuous pattern detection in streaming graphs. In EDBT, 2015

work page 2015
[20]

S. Chu, M. Balazinska, and D. Suciu. From theory to practice: Efﬁcient join query evaluation in a parallel database system. SIGMOD ’15, pages 63–78

work page
[21]

F. R. K. Chung, L. Lu, and V . H. Vu. The spectra of random graphs with given expected degrees. Internet Mathematics, 1(3), 2003

work page 2003
[22]

D. J. DeWitt and J. Gray. Parallel database systems: The future of database processing or a passing fad?SIGMOD Rec., 19(4):104–112

work page
[23]

Erdos and A

P. Erdos and A. Renyi. On the evolution of random graphs. In Publ. Math. Inst. Hungary. Acad. Sci., 1960

work page 1960
[24]

W. Fan, J. Li, J. Luo, Z. Tan, X. Wang, and Y . Wu. Incremental graph pattern matching. SIGMOD ’11, pages 925–936, 2011

work page 2011
[25]

Gorka and R

S. Gorka and R. Philip. Improving ﬁrst-party bank fraud detection with graph databases, 2016

work page 2016
[26]

J. A. Grochow and M. Kellis. Network motif discovery using subgraph enumeration and symmetry-breaking. In Proc. of RECOMB’07, 2007. 22 A PREPRINT - J UNE 28, 2019

work page 2007
[27]

W.-S. Han, J. Lee, and J.-H. Lee. Turboiso: Towards ultrafast and robust subgraph isomorphism search in large graph databases. In Proc. of SIGMOD’13, 2013

work page 2013
[28]

J. He, S. Zhang, and B. He. In-cache query co-processing on coupled cpu-gpu architectures. PVLDB, 8(4):329– 340, 2014

work page 2014
[29]

Herlihy and N

M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. 2008

work page 2008
[30]

Inoue, M

H. Inoue, M. Ohara, and K. Taura. Faster set intersection with simd instructions by reducing branch mispredic- tions. PVLDB, 8(3):293–304, 2014

work page 2014
[31]

Y . E. Ioannidis and Y . C. Kang. Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization. In SIGMOD’91, pages 168–177, 1991

work page 1991
[32]

Junghanns, M

M. Junghanns, M. Kiesling, A. Averbuch, A. Petermann, and E. Rahm. Cypher-based graph pattern matching in gradoop. GRADES’17, pages 3:1–3:8

work page
[33]

Kankanamge, S

C. Kankanamge, S. Sahu, A. Mhedbhi, J. Chen, and S. Salihoglu. Graphﬂow: An active graph database. SIG- MOD ’17, pages 1695–1698

work page
[34]

H. Kim, J. Lee, S. S. Bhowmick, W.-S. Han, J. Lee, S. Ko, and M. H. Jarrah. Dualsim: Parallel subgraph enumeration in a massive graph on a single machine. SIGMOD ’16, pages 1231–1245, 2016

work page 2016
[35]

L. Lai, L. Qin, X. Lin, and L. Chang. Scalable subgraph enumeration in mapreduce. PVLDB, 8(10):974–985, 2015

work page 2015
[36]

L. Lai, L. Qin, X. Lin, and L. Chang. Scalable subgraph enumeration in mapreduce: A cost-oriented approach. The VLDB Journal, 26(3):421–446, June 2017

work page 2017
[37]

L. Lai, L. Qin, X. Lin, Y . Zhang, L. Chang, and S. Yang. Scalable distributed subgraph enumeration. PVLDB, 10(3):217–228, 2016

work page 2016
[38]

Y . Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: A frame- work for machine learning and data mining in the cloud. PVLDB, 5(8):716–727, 2012

work page 2012
[39]

Y . Luo, W. Wang, and X. Lin. Spark: A keyword search engine on relational databases. In ICDE, pages 1552– 1555, April 2008

work page 2008
[40]

Y . Luo, W. Wang, X. Lin, X. Zhou, J. Wang, and K. Li. Spark2: Top-k keyword query in relational databases. IEEE Transactions on Knowledge and Data Engineering, 23(12):1763–1780, Dec 2011

work page 2011
[41]

Malewicz, M

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In Proc. of SIGMOD’10, 2010

work page 2010
[42]

McSherry, M

F. McSherry, M. Isard, and D. G. Murray. Scalability! but at what cost? HOTOS’15, 2015

work page 2015
[43]

D. G. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: A timely dataﬂow system. SOSP ’13, pages 439–455, 2013

work page 2013
[44]

H. Q. Ngo, E. Porat, C. Ré, and A. Rudra. Worst-case optimal join algorithms. J. ACM, 65(3), 2018

work page 2018
[45]

M. Qiao, H. Zhang, and H. Cheng. Subgraph matching: On compression and computation. PVLDB, 11(2):176– 188, 2017

work page 2017
[46]

Ren and J

X. Ren and J. Wang. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. PVLDB, 8(5):617–628, 2015

work page 2015
[47]

Shamir and D

R. Shamir and D. Tsur. Faster subtree isomorphism. In Proceedings of the Fifth Israeli Symposium on Theory of Computing and Systems, pages 126–131, 1997

work page 1997
[48]

Shang, Y

H. Shang, Y . Zhang, X. Lin, and J. X. Yu. Taming veriﬁcation hardness: An efﬁcient algorithm for testing subgraph isomorphism. PVLDB, 1(1):364–375, 2008

work page 2008
[49]

B. Shao, H. Wang, and Y . Li. Trinity: A distributed graph engine on a memory cloud. SIGMOD ’13, pages 505–516, 2013

work page 2013
[50]

Y . Shao, B. Cui, L. Chen, L. Ma, J. Yao, and N. Xu. Parallel subgraph listing in a large-scale graph. In SIGMOD’14, pages 625–636. ACM, 2014

work page 2014
[51]

Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li. Efﬁcient subgraph matching on billion node graphs. PVLDB, 5(9):788–799, 2012

work page 2012
[52]

C. H. C. Teixeira, A. J. Fonseca, M. Seraﬁni, G. Siganos, M. J. Zaki, and A. Aboulnaga. Arabesque: A system for distributed graph mining. SOSP ’15, pages 425–440. 23 A PREPRINT - J UNE 28, 2019

work page 2019
[53]

H. Wei, J. X. Yu, C. Lu, and X. Lin. Speedup graph processing by graph ordering. SIGMOD ’16, pages 1813– 1828

work page
[54]

Single Thread

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. In HotCloud’10, pages 10–10. A Auxiliary Experiments Exp-6 Scalability of Unlabelled Matching. We vary the number of machines as 1, 2, 4, 6, 8, 10, and run the unlabelled queriesq1 andq2 to see how each strategy (B INJOIN, WO PTJOIN, S HRCUBE a...

work page 2019

[1] [1]

http://www.dis.uniroma1.it/challenge9

The challenge9 datasets. http://www.dis.uniroma1.it/challenge9

work page

[2] [2]

https://lemurproject.org/clueweb12

The clubweb12 dataset. https://lemurproject.org/clueweb12

work page

[3] [3]

https://en.wikipedia.org/wiki/Sparse_matrix

Compressed sparse row. https://en.wikipedia.org/wiki/Sparse_matrix

work page

[4] [4]

http://giraph.apache.org/

Giraph. http://giraph.apache.org/

work page

[5] [5]

https://github.com/frankmcsherry/dataflow-join/

The implementation of bigjoin. https://github.com/frankmcsherry/dataflow-join/

work page

[6] [6]

http://ldbcouncil.org/benchmarks

Ldbc benchmarks. http://ldbcouncil.org/benchmarks

work page

[7] [7]

https://ldbc.github.io/ldbc_snb_docs/ ldbc-snb-specification.pdf

The ldbc social network benchmark. https://ldbc.github.io/ldbc_snb_docs/ ldbc-snb-specification.pdf

work page

[8] [8]

https://github.com/frankmcsherry/timely-dataflow

The open-sourced timely dataﬂow system. https://github.com/frankmcsherry/timely-dataflow

work page

[9] [9]

http://snap.stanford.edu/data/index.html

The snap datasets. http://snap.stanford.edu/data/index.html

work page

[10] [10]

http://law.di.unimi.it/datasets.php

The webgraph datasets. http://law.di.unimi.it/datasets.php

work page

[11] [11]

C. R. Aberger, S. Tu, K. Olukotun, and C. Ré. Emptyheaded: A relational engine for graph processing. SIGMOD ’16, pages 431–446

work page

[12] [12]

F. N. Afrati, D. Fotakis, and J. D. Ullman. Enumerating subgraph instances using map-reduce. In Proc. of ICDE’13, 2013

work page 2013

[13] [13]

Ammar, F

K. Ammar, F. McSherry, S. Salihoglu, and M. Joglekar. Distributed evaluation of subgraph queries using worst- case optimal low-memory dataﬂows. PVLDB, 11(6):691–704, 2018

work page 2018

[14] [14]

Angles, M

R. Angles, M. Arenas, P. Barceló, A. Hogan, J. Reutter, and D. Vrgo ˇc. Foundations of modern query languages for graph databases. ACM Comput. Surv., 50(5), 2017

work page 2017

[15] [15]

Beame, P

P. Beame, P. Koutris, and D. Suciu. Communication steps for parallel query processing. J. ACM, 64(6):40:1– 40:58, 2017

work page 2017

[16] [16]

F. Bi, L. Chang, X. Lin, L. Qin, and W. Zhang. Efﬁcient subgraph matching by postponing cartesian products. SIGMOD ’16, pages 1199–1214, 2016

work page 2016

[17] [17]

Carbone, A

P. Carbone, A. Katsifodimos, Kth, S. Sweden, S. Ewen, V . Markl, S. Haridi, and K. Tzoumas. Apache ﬂink TM: Stream and batch processing in a single engine. 38, 01 2015

work page 2015

[18] [18]

Chakrabarti, Y

D. Chakrabarti, Y . Zhan, and C. Faloutsos. R-mat: A recursive model for graph mining. In SDM, 2004

work page 2004

[19] [19]

Choudhury, L

S. Choudhury, L. B. Holder, G. Chin, K. Agarwal, and J. Feo. A selectivity based approach to continuous pattern detection in streaming graphs. In EDBT, 2015

work page 2015

[20] [20]

S. Chu, M. Balazinska, and D. Suciu. From theory to practice: Efﬁcient join query evaluation in a parallel database system. SIGMOD ’15, pages 63–78

work page

[21] [21]

F. R. K. Chung, L. Lu, and V . H. Vu. The spectra of random graphs with given expected degrees. Internet Mathematics, 1(3), 2003

work page 2003

[22] [22]

D. J. DeWitt and J. Gray. Parallel database systems: The future of database processing or a passing fad?SIGMOD Rec., 19(4):104–112

work page

[23] [23]

Erdos and A

P. Erdos and A. Renyi. On the evolution of random graphs. In Publ. Math. Inst. Hungary. Acad. Sci., 1960

work page 1960

[24] [24]

W. Fan, J. Li, J. Luo, Z. Tan, X. Wang, and Y . Wu. Incremental graph pattern matching. SIGMOD ’11, pages 925–936, 2011

work page 2011

[25] [25]

Gorka and R

S. Gorka and R. Philip. Improving ﬁrst-party bank fraud detection with graph databases, 2016

work page 2016

[26] [26]

J. A. Grochow and M. Kellis. Network motif discovery using subgraph enumeration and symmetry-breaking. In Proc. of RECOMB’07, 2007. 22 A PREPRINT - J UNE 28, 2019

work page 2007

[27] [27]

W.-S. Han, J. Lee, and J.-H. Lee. Turboiso: Towards ultrafast and robust subgraph isomorphism search in large graph databases. In Proc. of SIGMOD’13, 2013

work page 2013

[28] [28]

J. He, S. Zhang, and B. He. In-cache query co-processing on coupled cpu-gpu architectures. PVLDB, 8(4):329– 340, 2014

work page 2014

[29] [29]

Herlihy and N

M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. 2008

work page 2008

[30] [30]

Inoue, M

H. Inoue, M. Ohara, and K. Taura. Faster set intersection with simd instructions by reducing branch mispredic- tions. PVLDB, 8(3):293–304, 2014

work page 2014

[31] [31]

Y . E. Ioannidis and Y . C. Kang. Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization. In SIGMOD’91, pages 168–177, 1991

work page 1991

[32] [32]

Junghanns, M

M. Junghanns, M. Kiesling, A. Averbuch, A. Petermann, and E. Rahm. Cypher-based graph pattern matching in gradoop. GRADES’17, pages 3:1–3:8

work page

[33] [33]

Kankanamge, S

C. Kankanamge, S. Sahu, A. Mhedbhi, J. Chen, and S. Salihoglu. Graphﬂow: An active graph database. SIG- MOD ’17, pages 1695–1698

work page

[34] [34]

H. Kim, J. Lee, S. S. Bhowmick, W.-S. Han, J. Lee, S. Ko, and M. H. Jarrah. Dualsim: Parallel subgraph enumeration in a massive graph on a single machine. SIGMOD ’16, pages 1231–1245, 2016

work page 2016

[35] [35]

L. Lai, L. Qin, X. Lin, and L. Chang. Scalable subgraph enumeration in mapreduce. PVLDB, 8(10):974–985, 2015

work page 2015

[36] [36]

L. Lai, L. Qin, X. Lin, and L. Chang. Scalable subgraph enumeration in mapreduce: A cost-oriented approach. The VLDB Journal, 26(3):421–446, June 2017

work page 2017

[37] [37]

L. Lai, L. Qin, X. Lin, Y . Zhang, L. Chang, and S. Yang. Scalable distributed subgraph enumeration. PVLDB, 10(3):217–228, 2016

work page 2016

[38] [38]

Y . Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: A frame- work for machine learning and data mining in the cloud. PVLDB, 5(8):716–727, 2012

work page 2012

[39] [39]

Y . Luo, W. Wang, and X. Lin. Spark: A keyword search engine on relational databases. In ICDE, pages 1552– 1555, April 2008

work page 2008

[40] [40]

Y . Luo, W. Wang, X. Lin, X. Zhou, J. Wang, and K. Li. Spark2: Top-k keyword query in relational databases. IEEE Transactions on Knowledge and Data Engineering, 23(12):1763–1780, Dec 2011

work page 2011

[41] [41]

Malewicz, M

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In Proc. of SIGMOD’10, 2010

work page 2010

[42] [42]

McSherry, M

F. McSherry, M. Isard, and D. G. Murray. Scalability! but at what cost? HOTOS’15, 2015

work page 2015

[43] [43]

D. G. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: A timely dataﬂow system. SOSP ’13, pages 439–455, 2013

work page 2013

[44] [44]

H. Q. Ngo, E. Porat, C. Ré, and A. Rudra. Worst-case optimal join algorithms. J. ACM, 65(3), 2018

work page 2018

[45] [45]

M. Qiao, H. Zhang, and H. Cheng. Subgraph matching: On compression and computation. PVLDB, 11(2):176– 188, 2017

work page 2017

[46] [46]

Ren and J

X. Ren and J. Wang. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. PVLDB, 8(5):617–628, 2015

work page 2015

[47] [47]

Shamir and D

R. Shamir and D. Tsur. Faster subtree isomorphism. In Proceedings of the Fifth Israeli Symposium on Theory of Computing and Systems, pages 126–131, 1997

work page 1997

[48] [48]

Shang, Y

H. Shang, Y . Zhang, X. Lin, and J. X. Yu. Taming veriﬁcation hardness: An efﬁcient algorithm for testing subgraph isomorphism. PVLDB, 1(1):364–375, 2008

work page 2008

[49] [49]

B. Shao, H. Wang, and Y . Li. Trinity: A distributed graph engine on a memory cloud. SIGMOD ’13, pages 505–516, 2013

work page 2013

[50] [50]

Y . Shao, B. Cui, L. Chen, L. Ma, J. Yao, and N. Xu. Parallel subgraph listing in a large-scale graph. In SIGMOD’14, pages 625–636. ACM, 2014

work page 2014

[51] [51]

Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li. Efﬁcient subgraph matching on billion node graphs. PVLDB, 5(9):788–799, 2012

work page 2012

[52] [52]

C. H. C. Teixeira, A. J. Fonseca, M. Seraﬁni, G. Siganos, M. J. Zaki, and A. Aboulnaga. Arabesque: A system for distributed graph mining. SOSP ’15, pages 425–440. 23 A PREPRINT - J UNE 28, 2019

work page 2019

[53] [53]

H. Wei, J. X. Yu, C. Lu, and X. Lin. Speedup graph processing by graph ordering. SIGMOD ’16, pages 1813– 1828

work page

[54] [54]

Single Thread

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. In HotCloud’10, pages 10–10. A Auxiliary Experiments Exp-6 Scalability of Unlabelled Matching. We vary the number of machines as 1, 2, 4, 6, 8, 10, and run the unlabelled queriesq1 andq2 to see how each strategy (B INJOIN, WO PTJOIN, S HRCUBE a...

work page 2019