pith. sign in

arxiv: 1012.0280 · v1 · pith:HTLK5K4Vnew · submitted 2010-12-01 · 💻 cs.DS

String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)

classification 💻 cs.DS
keywords algorithmlengthtimealphabetabigocomplexityfactors
0
0 comments X
read the original abstract

We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\bigO(nm\max(\alpha, \beta))$-time complexity in the worst case and $\bigO(\max(\alpha, \beta))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion. Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a $\bigO(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log^{1-\epsilon} m)$, where $\epsilon > 0$ and $\sigma$ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in practical cases.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.