String matching with inversions and translocations in linear average time (most of the time)

0.00 Avg rating—0 Votes

Article ID:	iaor20114434
Volume:	111
Issue:	11
Start Page Number:	516
End Page Number:	520
Publication Date:	May 2011
Journal:	Information Processing Letters
Authors:	Grabowski Szymon, Faro Simone, Giaquinta Emanuele
Keywords:	matching, DNA, approximation algorithms

Abstract:

We present an efficient algorithm for finding all approximate occurrences of a given pattern p of length m in a text t of length n allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an 𝒪(nm max(α, β)) -time complexity in the worst case and 𝒪(nm max(α, β, σ)) -space complexity, where α and β are respectively the maximum length of the factors involved in any translocation and inversion, and σ is the alphabet size. Moreover we show that our algorithm has an 𝒪(n) average time complexity, whenever σ=Ω(log m/log log^1-ϵm, for ϵ>0. Experiments show that the proposed algorithm achieves very good results in practical cases.

Reviews

Required fields are marked *. Your email address will not be published.