Recognition: 1 theorem link
· Lean TheoremOptimal-Time Move Structure Construction
Pith reviewed 2026-05-15 00:57 UTC · model grok-4.3
The pith
A new algorithm constructs any permutation's move structure in optimal O(r) time and space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The move structure encodes a permutation as a minimal covering by O(r) disjoint intervals on the domain such that values within each interval permute together; the paper gives a construction that builds this covering in a single O(r)-time scan and constant-time adjacency accesses, achieving the information-theoretic optimum.
What carries the argument
The move structure: a covering of the permutation by O(r) intervals that permits O(1)-time evaluation of π(i) and the interval index of π(i).
If this is right
- Longest-common-prefix array can be computed from a run-length-encoded Burrows-Wheeler transform in O(n) time using only O(r) working space.
- Any compressed index that already stores a move structure can now build it in linear rather than super-linear time.
- Permutations whose run count r is much smaller than n can be navigated in compressed space without paying a logarithmic construction overhead.
Where Pith is reading between the lines
- The same linear-time interval-merging technique may extend to other run-length-compressed objects such as run-length FM-indexes or sparse suffix arrays.
- Because the construction avoids sorting or priority queues, it could be adapted to external-memory or streaming settings where only sequential passes are cheap.
Load-bearing premise
The input permutation or run-length-encoded Burrows-Wheeler transform is given in a representation that supports linear-time sequential scan and constant-time access to neighboring entries.
What would settle it
An explicit permutation on which the claimed algorithm requires more than c·r operations or more than c·r words for some small constant c.
read the original abstract
The move structure represents a permutation $\pi$ of $[0,n)$ as a covering set of $O(r)$ disjoint intervals (contiguous subsets of $[0,n)$), where $r$ is the minimum number of intervals whose values permute together. Formally, $r = 1 + |\{i\in [1,n) : \pi(i) - 1 \neq \pi(i-1)\}|$. The move structure takes $O(r)$ words of space. Given the index of the interval containing $i$, it allows computing $\pi(i)$ and the index of the interval containing $\pi(i)$ in $O(1)$-time. Therefore, for permutations where $r \ll n$, it allows their representation and navigation in significantly compressed space. The previous best $O(r)$-space move structure construction algorithm takes $O(r\log r)$-time. In this paper, we describe a construction algorithm achieving optimal $O(r)$-time and space. We also show that using our improved algorithm within a recent previous work allows the computation of the longest common prefix array in $O(r)$-working space and optimal $O(n)$-time given the run-length-encoded Burrows-Wheeler transform. Finally, we implement our improved move structure construction algorithm and find that it is faster than the previous best algorithm while using comparable memory.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims an optimal O(r)-time and O(r)-space algorithm for constructing the move structure of a permutation π (represented as O(r) disjoint intervals where r is the number of runs), improving on the prior O(r log r) construction. It further shows that embedding this construction yields an O(r)-space, O(n)-time algorithm for the LCP array given the run-length-encoded BWT, and reports a practical implementation that is faster than the previous best while using comparable memory.
Significance. If correct, the result supplies the first optimal-time construction for a compressed permutation representation supporting O(1)-time navigation and interval indexing. The LCP application is a concrete, useful consequence for string algorithms operating on run-length compressed data. The authors also supply an implementation whose empirical speed-up provides reproducible evidence that the asymptotic improvement translates to practice.
major comments (2)
- [§3] §3 (Algorithm description): the O(r) bound is stated to follow from a constant number of linear scans over the r runs, but the text does not explicitly verify that the input representation (explicit sequence of r runs) permits O(1) access to adjacent run values without hidden logarithmic factors in the RAM model; this assumption is load-bearing for the optimality claim.
- [§4] §4 (LCP application): the reduction to O(n) time and O(r) space is sketched by invoking the new move-structure construction inside a prior framework, but no concrete recurrence or space accounting is supplied showing that the working space remains strictly O(r) throughout the LCP computation.
minor comments (2)
- [Abstract] The formal definition of r in the abstract is correct but would benefit from a one-sentence example illustrating how the count of value discontinuities produces the run count.
- [Experiments] Experimental tables report wall-clock times but omit the precise machine model, word size, and input alphabet sizes used for the RLBWT instances.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the two constructive comments on the presentation. Both points can be addressed with minor clarifications that strengthen the manuscript without altering its technical content. We outline our responses below and will incorporate the suggested additions in the revised version.
read point-by-point responses
-
Referee: [§3] §3 (Algorithm description): the O(r) bound is stated to follow from a constant number of linear scans over the r runs, but the text does not explicitly verify that the input representation (explicit sequence of r runs) permits O(1) access to adjacent run values without hidden logarithmic factors in the RAM model; this assumption is load-bearing for the optimality claim.
Authors: We agree that an explicit statement of the model would remove any ambiguity. In the standard word-RAM model (word size Θ(log n)), the input is an explicit array of r runs; each array access, including to adjacent runs, is O(1) time. Consequently every linear scan costs O(r) time with no hidden logarithmic factors. We will insert a short paragraph at the beginning of §3 stating the model and confirming that all accesses are direct array accesses. revision: yes
-
Referee: [§4] §4 (LCP application): the reduction to O(n) time and O(r) space is sketched by invoking the new move-structure construction inside a prior framework, but no concrete recurrence or space accounting is supplied showing that the working space remains strictly O(r) throughout the LCP computation.
Authors: We accept that a more explicit space analysis improves clarity. The prior framework already maintains O(r) space for the RLBWT and auxiliary arrays; our O(r)-time move-structure construction replaces the previous O(r log r) subroutine and uses only O(r) additional working space that is released after construction. We will add a dedicated paragraph in §4 that gives the space recurrence T(n,r) = O(r) + T'(n,r) where T' denotes the space of the prior framework, confirming that the total working space never exceeds O(r). revision: yes
Circularity Check
No significant circularity in algorithmic derivation
full rationale
The paper describes a direct algorithmic construction for the move structure that performs a constant number of linear scans over the input's r runs using only array indexing and arithmetic under the RAM model. No equations define outputs in terms of themselves, no parameters are fitted to subsets of data and then relabeled as predictions, and no load-bearing self-citations or uniqueness theorems are invoked to justify the O(r) bound. The running-time claim follows immediately from the input representation permitting O(1) access per run and O(r) total work, which is independent of the constructed output structure. This is a standard self-contained algorithmic argument with no reduction to its own results by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard word-RAM model with constant-time word operations and linear-time input scanning
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquationwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We present an optimal-time and space move structure construction algorithm which satisfies the balancing criterion... using linked lists... balancing π and π⁻¹ simultaneously
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Faster Iterative $\phi$ Queries on the Positional BWT
A refined segments decomposition yields two space-time tradeoffs for k iterative φ queries on the PBWT, improving on prior O((r̃ + h) log n) space and O(k log log m) time.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.