Article ID: | iaor20108531 |
Volume: | 208 |
Issue: | 2 |
Start Page Number: | 142 |
End Page Number: | 152 |
Publication Date: | Jan 2011 |
Journal: | European Journal of Operational Research |
Authors: | Blazewicz Jacek, Kasprzak Marta, Burke Edmund K, Kovalyov Mikhail Y, Kovalev Alexandr |
Keywords: | programming: linear |
The simplified partial digest problem (SPDP) is motivated by the reconstruction of the linear structure of a DNA chain with respect to a given nucleotide pattern. The reconstruction is based on the multiset of distances between adjacent occurrences of the pattern (interpoint distances) and the multiset of distances between each occurrence of the pattern and the two unlabeled endpoints of the DNA chain (end distances). We consider optimization versions of the problem, called SPDP-Min and SPDP-Max. The aim of SPDP-Min (SPDP-Max) is to find a DNA linear structure with the same multiset of end distances and the minimum (maximum) number of incorrect (correct) interpoint distances. Results are presented on the worst-case efficiency of approximation algorithms for these problems. We suggest a graph-theoretic model for SPDP-Min and SPDP-Max, which can be used to reduce the search space for an optimal solution to either problem. We also present polynomial-time heuristic algorithms based on this model. In computational experiments with randomly generated and real-life input data, our best algorithm delivered an optimal solution in 100% of the instances with at most 50 restriction sites.
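The two input multisets and the SPDP-Max objective described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm or graph-theoretic model; it only evaluates a candidate structure. The function names are hypothetical, and the sketch assumes integer site positions on a chain of known length and that interpoint distances include the segments from the chain endpoints to the nearest site.

```python
from collections import Counter

def interpoint_distances(sites, length):
    """Multiset of distances between adjacent points.

    Assumption: the two chain endpoints (0 and length) count as points.
    """
    points = [0] + sorted(sites) + [length]
    return Counter(b - a for a, b in zip(points, points[1:]))

def end_distances(sites, length):
    """Multiset of distances from each site to both chain endpoints."""
    ends = Counter()
    for s in sites:
        ends[s] += 1
        ends[length - s] += 1
    return ends

def correct_interpoint_count(sites, length, observed):
    """SPDP-Max objective for a candidate: the number of interpoint
    distances it produces that also occur in the observed multiset."""
    produced = interpoint_distances(sites, length)
    # Counter intersection keeps the minimum count of each distance.
    return sum((produced & observed).values())

# Hypothetical instance: chain of length 10 with true sites at 2 and 3.
observed = interpoint_distances([2, 3], 10)

# The candidate {2, 7} has the same end distances as the true structure
# but matches fewer of the observed interpoint distances.
same_ends = end_distances([2, 7], 10) == end_distances([2, 3], 10)
score_true = correct_interpoint_count([2, 3], 10, observed)
score_other = correct_interpoint_count([2, 7], 10, observed)
```

Because the endpoints are unlabeled, a structure and its mirror image (each site s replaced by length - s) yield identical multisets, so any such evaluation cannot distinguish between them.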