Recognition: unknown
The Communication Complexity of Pattern Matching with Edits Revisited
Pith reviewed 2026-05-10 08:24 UTC · model grok-4.3
The pith
The one-way communication complexity of reporting k-edit occurrences (including the edit sequences) is Θ(n/m · k log(m|Σ|/k)) bits for 0 < k < m < n/2.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We improve the encoding size to O(n/m · k log(m|Σ|/k)), which matches the lower bound for constant-sized alphabets. We further establish a new tight lower bound of Ω(n/m · k log(m|Σ|/k)) for the edit sequence reporting variant.
Load-bearing premise
The analysis assumes the standard one-way communication model and the parameter regime 0 < k < m < n/2; the lower-bound construction may not extend outside this regime or to two-way communication.
Figures
read the original abstract
In the decades-old Pattern Matching with Edits problem, given a length-$n$ string $T$ (the text), a length-$m$ string $P$ (the pattern), and a positive integer $k$ (the threshold), the task is to list the $k$-error occurrences of $P$ in $T$, that is, all fragments of $T$ whose edit distance to $P$ is at most $k$. The one-way communication complexity of Pattern Matching with Edits is the minimum number of bits that Alice, given an instance $(P, T, k)$ of the problem, must send to Bob so that Bob can reconstruct the answer solely from that message. For the natural parameter regime of $0 < k < m < n/2$, our recent work [STOC'24] yields that ${\Omega}(n/m \cdot k \log(m/k))$ bits are necessary and $O(n/m \cdot k \log^2 m)$ bits are sufficient for Pattern Matching with Edits. More generally, for strings over an alphabet ${\Sigma}$, our recent work [STOC'24] gives an $O(n/m \cdot k \log m \log(m|{\Sigma}|))$-bit encoding that allows one to recover a shortest sequence of edits for every $k$-error occurrence of $P$ in $T$. In this work, we revisit the original proof and improve the encoding size to $O(n/m \cdot k \log(m|{\Sigma}|/k))$, which matches the lower bound for constant-sized alphabets. We further establish a new tight lower bound of ${\Omega}(n/m \cdot k \log(m|{\Sigma}|/k))$ for the edit sequence reporting variant that we solve. Our encoding size also matches the communication complexity established for the simpler Pattern Matching with Mismatches problem in the context of streaming algorithms [Clifford, Kociumaka, Porat; SODA'19].
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper revisits the one-way communication complexity of Pattern Matching with Edits. For 0 < k < m < n/2, it improves the encoding size from O(n/m · k log m log(m|Σ|)) to O(n/m · k log(m|Σ|/k)) bits, allowing recovery of shortest edit sequences for all k-error occurrences of P in T. It also proves a matching Ω(n/m · k log(m|Σ|/k)) lower bound for the edit sequence reporting variant, achieving tightness for constant-sized alphabets and matching the known complexity of the mismatches variant.
Significance. If correct, the result tightens the communication complexity of a core string problem by removing an extraneous log factor from the prior STOC'24 upper bound while establishing a new tight lower bound for sequence reporting. The explicit encoding construction and reduction-based lower bound provide a clean, parameter-improved characterization that aligns with the mismatches case from SODA'19.
major comments (1)
- Abstract and §1: the upper-bound construction is claimed to solve the edit-sequence-reporting variant, yet the lower bound is stated only for that variant; the manuscript must explicitly confirm that the O(n/m · k log(m|Σ|/k)) encoding outputs the sequences (not merely their existence or positions) so that the matching claim is load-bearing.
minor comments (2)
- Abstract: the regime statement '0 < k < m < n/2' is repeated but the dependence on |Σ| in the new bound should be highlighted earlier to clarify the constant-alphabet tightness.
- Notation: ensure that log(m|Σ|/k) is written unambiguously (parentheses) in all equations and that the base of the logarithm is stated once.
Simulated Author's Rebuttal
We thank the referee for the careful reading of the manuscript and the positive recommendation for minor revision. We address the single major comment below.
read point-by-point responses
-
Referee: [—] Abstract and §1: the upper-bound construction is claimed to solve the edit-sequence-reporting variant, yet the lower bound is stated only for that variant; the manuscript must explicitly confirm that the O(n/m · k log(m|Σ|/k)) encoding outputs the sequences (not merely their existence or positions) so that the matching claim is load-bearing.
Authors: We agree that an explicit statement would strengthen clarity. The improved encoding we present is a direct refinement of the STOC'24 construction, which (as already noted in the manuscript) outputs a shortest edit sequence for every k-error occurrence rather than merely the positions or existence of those occurrences. The new lower bound is proved specifically for this sequence-reporting variant. We will revise the abstract and the first paragraph of §1 to state explicitly that the O(n/m · k log(m|Σ|/k))-bit message enables recovery of the edit sequences themselves, thereby making the tightness claim load-bearing for the reporting variant. revision: yes
Circularity Check
Minor self-citation to prior STOC'24 work; new upper/lower bounds are explicit constructions and reductions.
full rationale
The paper improves the encoding via an explicit construction to O(n/m · k log(m|Σ|/k)) and proves a new tight lower bound Ω(n/m · k log(m|Σ|/k)) for edit sequence reporting via reduction from a communication problem. These steps do not reduce to fitted parameters or self-definitions within the paper. The citation to the authors' own STOC'24 result is used only for context on prior (weaker) bounds and is not load-bearing for the new claims, which stand independently as constructions and reductions in the standard one-way model.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Edit distance is the minimum number of insertions, deletions, and substitutions needed to transform one string into another.
- domain assumption One-way communication complexity is the minimum bits Alice must send to Bob so Bob can output the answer.
Reference graph
Works this paper leans on
-
[1]
write newline
" write newline "" before.all 'output.state := FUNCTION fin.entry.original add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 i...
-
[2]
Sign-rank of k - H amming distance is constant
Panagiotis Charalampopoulos, Tomasz Kociumaka, and Philip Wellnitz. Pattern matching under weighted edit distance. FOCS , 2025. https://arxiv.org/abs/2510.17752 arXiv:2510.17752 , https://doi.org/10.1109/FOCS63196.2025.00102 doi:10.1109/FOCS63196.2025.00102
-
[3]
Near-optimal-time quantum algorithms for approximate pattern matching
Tomasz Kociumaka, Jakob Nogler, and Philip Wellnitz. Near-optimal-time quantum algorithms for approximate pattern matching. SODA , 2025. https://arxiv.org/abs/2410.06808 arXiv:2410.06808 , https://doi.org/10.1137/1.9781611978322.15 doi:10.1137/1.9781611978322.15
-
[4]
Daniel Gibney, Ce Jin, Tomasz Kociumaka, and Sharma V. Thankachan. Near-optimal quantum algorithms for bounded edit distance and Lempel-Ziv factorization. SODA , 2024. https://arxiv.org/abs/2311.01793 arXiv:2311.01793 , https://doi.org/10.1137/1.9781611977912.11 doi:10.1137/1.9781611977912.11
-
[5]
Communication lower bounds for collision problems via density increment arguments
Tomasz Kociumaka, Jakob Nogler, and Philip Wellnitz. On the communication complexity of approximate pattern matching. STOC , 2024. https://arxiv.org/abs/2403.18812 arXiv:2403.18812 , https://doi.org/10.1145/3618260.3649604 doi:10.1145/3618260.3649604
-
[6]
Almost linear size edit distance sketch
Michal Kouck \' y and Michael Saks. Almost linear size edit distance sketch. STOC , 2024. https://arxiv.org/abs/2406.11225 arXiv:2406.11225 , https://doi.org/10.1145/3618260.3649783 doi:10.1145/3618260.3649783
-
[7]
Streaming k -edit approximate pattern matching via string decomposition
Sudatta Bhattacharya and Michal Kouck \' y . Streaming k -edit approximate pattern matching via string decomposition. ICALP , 2023. https://arxiv.org/abs/2305.00615 arXiv:2305.00615 , https://doi.org/10.4230/LIPIcs.ICALP.2023.22 doi:10.4230/LIPIcs.ICALP.2023.22
-
[8]
and Li, Jerry and Liu, Allen and Narayanan, Shyam , booktitle =
Alejandro Cassis, Tomasz Kociumaka, and Philip Wellnitz. Optimal algorithms for bounded weighted edit distance. FOCS , 2023. https://arxiv.org/abs/2305.06659 arXiv:2305.06659 , https://doi.org/10.1109/FOCS57990.2023.00135 doi:10.1109/FOCS57990.2023.00135
-
[9]
Kim, Vinod Vaikuntanathan, and Or Zamir
Panagiotis Charalampopoulos, Tomasz Kociumaka, and Philip Wellnitz. Faster pattern matching under edit distance: A reduction to dynamic puzzle matching and the seaweed monoid of permutation matrices. FOCS , 2022. https://arxiv.org/abs/2204.03087 arXiv:2204.03087 , https://doi.org/10.1109/FOCS54457.2022.00072 doi:10.1109/FOCS54457.2022.00072
-
[10]
Covering polygons is even harder
Tomasz Kociumaka, Ely Porat, and Tatiana Starikovskaya. Small-space and streaming pattern matching with k edits. FOCS , 2021. https://arxiv.org/abs/2106.06037 arXiv:2106.06037 , https://doi.org/10.1109/FOCS52979.2021.00090 doi:10.1109/FOCS52979.2021.00090
-
[11]
Monochromatic Triangles, Triangle Listing and
Panagiotis Charalampopoulos, Tomasz Kociumaka, and Philip Wellnitz. Faster approximate pattern matching: A unified approach. FOCS , 2020. https://arxiv.org/abs/2004.08350 arXiv:2004.08350 , https://doi.org/10.1109/FOCS46700.2020.00095 doi:10.1109/FOCS46700.2020.00095
-
[12]
The streaming k -mismatch problem
Rapha \" e l Clifford, Tomasz Kociumaka, and Ely Porat. The streaming k -mismatch problem. SODA , 2019. https://arxiv.org/abs/1708.05223 arXiv:1708.05223 , https://doi.org/10.1137/1.9781611975482.68 doi:10.1137/1.9781611975482.68
-
[13]
Edit distance cannot be computed in strongly subquadratic time (unless SETH is false)
Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM Journal on Computing , 2018. https://arxiv.org/abs/1412.0348 arXiv:1412.0348 , https://doi.org/10.1137/15M1053128 doi:10.1137/15M1053128
-
[14]
Communication and streaming complexity of approximate pattern matching
Tatiana Starikovskaya. Communication and streaming complexity of approximate pattern matching. CPM , 2017. https://doi.org/10.4230/LIPIcs.CPM.2017.13 doi:10.4230/LIPIcs.CPM.2017.13
-
[15]
Landau, Rajeev Raman, Kunihiko Sadakane, Srinivasa Rao Satti, and Oren Weimann
Philip Bille, Gad M. Landau, Rajeev Raman, Kunihiko Sadakane, Srinivasa Rao Satti, and Oren Weimann. Random access to grammar-compressed strings and trees. SIAM Journal on Computing , 2015. https://arxiv.org/abs/1001.1565 arXiv:1001.1565 , https://doi.org/10.1137/130936889 doi:10.1137/130936889
-
[16]
Threshold approximate matching in grammar-compressed strings
Alexander Tiskin. Threshold approximate matching in grammar-compressed strings. PSC , 2014
2014
-
[17]
Beating O (nm) in approximate LZW -compressed pattern matching
Paweł Gawrychowski and Damian Straszak. Beating O (nm) in approximate LZW -compressed pattern matching. ISAAC , 2013. https://arxiv.org/abs/1308.6509 arXiv:1308.6509 , https://doi.org/10.1007/978-3-642-45030-3_8 doi:10.1007/978-3-642-45030-3_8
-
[18]
Approximate string matching: A simpler faster algorithm
Richard Cole and Ramesh Hariharan. Approximate string matching: A simpler faster algorithm. SIAM Journal on Computing , 2002. https://doi.org/10.1137/S0097539700370527 doi:10.1137/S0097539700370527
-
[19]
Gad M. Landau and Uzi Vishkin. Fast parallel and serial approximate string matching. Journal of Algorithms , 1989. https://doi.org/10.1016/0196-6774(89)90010-2 doi:10.1016/0196-6774(89)90010-2
-
[20]
Gad M. Landau and Uzi Vishkin. Fast string matching with k differences. Journal of Computer and System Sciences , 1988. https://doi.org/10.1016/0022-0000(88)90045-1 doi:10.1016/0022-0000(88)90045-1
-
[21]
Peter H. Sellers. The theory and computation of evolutionary distances: Pattern recognition. Journal of Algorithms , 1980. https://doi.org/10.1016/0196-6774(80)90016-4 doi:10.1016/0196-6774(80)90016-4
-
[22]
A universal algorithm for sequential data compression.IEEE Trans
Jacob Ziv and Abraham Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory , 1977. https://doi.org/10.1109/TIT.1977.1055714 doi:10.1109/TIT.1977.1055714
-
[23]
Nathan J. Fine and Herbert S. Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society , 1965. https://doi.org/10.1090/S0002-9939-1965-0174934-9 doi:10.1090/S0002-9939-1965-0174934-9
-
[24]
Binary codes capable of correcting deletions, insertions and reversals
Vladimir Iosifovich Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR , 1965
1965
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.