pith. machine review for the scientific record. sign in

arxiv: 2602.13735 · v2 · submitted 2026-02-14 · 💻 cs.DS

Recognition: unknown

Compressed Index with Construction in Compressed Space

Dmitry Kosolobov

Authors on Pith no claims yet
classification 💻 cs.DS
keywords deltaspacefracindextimestringalphaconstruction
0
0 comments X
read the original abstract

Suppose that we are given a string $s$ of length $n$ over an alphabet $\{0,1,\ldots,n^{O(1)}\}$ and $\delta$ is the string complexity of $s$, a known compression measure. We describe an index on $s$ with $O(\delta\log\frac{n}{\delta})$ space, measured in $O(\log n)$-bit machine words, which can search in $s$ any string of length $m$ in $O(m + (\mathrm{occ} + 1)\log^\epsilon n)$ time, where $\mathrm{occ}$ is the number of occurrences and $\epsilon > 0$ is any fixed constant (the big-O in the space bound hides factor $\frac{1}{\epsilon}$). Crucially, the index can be built in $O(n\log n)$ expected time by one left-to-right pass on the string $s$ in a streaming fashion with $O(\delta\log\frac{n}{\delta})$ construction space. The index does not use the Karp--Rabin fingerprints, and the randomization in the construction time can be eliminated by using deterministic dictionaries instead of hash tables (with a slowdown). The search time matches currently best results and the space is almost optimal (the known optimum is $O(\delta\log\frac{n}{\delta\alpha})$, where $\alpha = \log_\sigma n$ and $\sigma$ is the alphabet size, and it coincides with $O(\delta\log\frac{n}{\delta})$ when $\delta = O(n / \alpha^2)$). This is the first index that can be constructed within such space and with such time guarantees. To avoid uninteresting marginal cases, all above bounds are stated for $\delta \ge \Omega(\log\log n)$.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dynamic Grammar-Compressed Self-Index in $\delta$-Optimal Space

    cs.DS 2026-04 unverdicted novelty 8.0

    The dynamic RR-index is the first dynamic self-index to attain δ-optimal space, with locate in expected O(m + log m log² n + occ (log n / log log n)) time and updates in O(m' log² n + log³ n) time.