Parametric optimization of sequence alignment

Article ID:	iaor1995979
Country:	United States
Volume:	12
Issue:	4/5
Start Page Number:	312
End Page Number:	326
Publication Date:	Jul 1994
Journal:	Algorithmica
Authors:	Gusfield D., Naor D., Balasubramanian K.
Keywords:	programming: dynamic

Abstract:

The optimal alignment, or weighted minimum edit distance, between two DNA or amino acid sequences for a given set of weights is computed by classical dynamic programming techniques and is widely used in molecular biology. However, for DNA and amino acid sequences there is considerable disagreement about how to weight matches, mismatches, insertions/deletions (indels or spaces), and gaps. Parametric sequence alignment is the problem of computing the optimal-valued alignment between two sequences as a function of variable weights for matches, mismatches, spaces, and gaps. The goal is to partition the parameter space into regions (which are necessarily convex) such that in each region one alignment is optimal throughout and such that the regions are maximal for this property. In this paper the authors are primarily concerned with the structure of this convex decomposition, and secondarily with the complexity of computing the decomposition. The most striking results are the following: For the special case where only matches, mismatches, and spaces are counted, and where spaces are counted throughout the alignment, the authors show that the decomposition is surprisingly simple: all regions are infinite; there are at most n^(2/3) regions; the lines that bound the regions are all of the form β=c+(c+0.5)α; and the entire decomposition can be found in O(knm) time, where k is the actual number of regions, and n>m are the lengths of the two strings. These results were found while implementing a large software package for parametric sequence analysis, and in turn have led to faster algorithms for those tasks.
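
As a rough illustration of the fixed-weight problem that the parametric version generalizes, the following Python sketch computes one optimal global alignment by classical dynamic programming for a single choice of weights. The abstract does not state the objective function, so the form used here (matches minus α times mismatches minus β times spaces, with gaps ignored) is an assumption, and the function name `align` is hypothetical. It is not the authors' O(knm) decomposition algorithm.

```python
def align(s, t, alpha, beta):
    """Global alignment for one fixed pair of weights (alpha, beta).

    Assumed objective (not taken from the abstract):
        value = matches - alpha * mismatches - beta * spaces
    Returns (value, matches, mismatches, spaces) for one optimal alignment.
    """
    n, m = len(s), len(t)
    # D[i][j] holds (value, matches, mismatches, spaces) for s[:i] vs t[:j].
    D = [[None] * (m + 1) for _ in range(n + 1)]
    D[0][0] = (0.0, 0, 0, 0)
    for i in range(1, n + 1):
        D[i][0] = (-beta * i, 0, 0, i)  # s[:i] aligned against spaces only
    for j in range(1, m + 1):
        D[0][j] = (-beta * j, 0, 0, j)  # t[:j] aligned against spaces only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            v, mt, ms, sp = D[i - 1][j - 1]
            if s[i - 1] == t[j - 1]:
                diag = (v + 1, mt + 1, ms, sp)      # match
            else:
                diag = (v - alpha, mt, ms + 1, sp)  # mismatch
            v, mt, ms, sp = D[i - 1][j]
            up = (v - beta, mt, ms, sp + 1)         # space opposite s[i-1]
            v, mt, ms, sp = D[i][j - 1]
            left = (v - beta, mt, ms, sp + 1)       # space opposite t[j-1]
            # Keep the best value; the counts travel with the chosen cell.
            D[i][j] = max(diag, up, left, key=lambda c: c[0])
    return D[n][m]
```

Calling `align` at several (α, β) points and comparing the returned counts gives an informal sense of the decomposition: points whose optimal alignments share the same counts of matches, mismatches, and spaces lie in the same region under this assumed objective. Sampling of this kind only approximates the region boundaries, which is what the exact parametric methods in the paper avoid.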
