Logic based methods for SNPs tagging and reconstruction

Article ID: iaor2010860
Volume: 37
Issue: 8
Start Page Number: 1419
End Page Number: 1426
Publication Date: Aug 2010
Journal: Computers and Operations Research
Authors: , ,
Keywords: DNA
Abstract:

Single nucleotide polymorphisms (SNPs) are the positions in a DNA sequence at which individuals differ. Knowledge of SNPs is crucial for disease association studies, but although such positions are few (about 1% of the entire sequence), extracting the complete information is very costly. Recent studies have shown that DNA sequences are structured into blocks of positions that are conserved during evolution and within which the values (alleles) at different loci are strongly correlated. This block structure suggests a way to reduce the cost of extracting SNP information: limit the process to a subset of SNPs, the so-called Tag SNPs, that retains most of the information contained in the whole sequence. In this paper, we apply a technique for feature selection based on integer programming to the problem of Tag SNP selection. Moreover, to test the quality of our approach, we also consider the problem of SNP reconstruction, i.e. the problem of deriving unknown SNPs from the values of the Tag SNPs, and propose two reconstruction methods, one based on a majority vote and the other on a machine learning approach. We test our algorithm on two public data sets of different nature; the results are, where comparable, in line with the related literature. An interesting aspect of the proposed method is its ability to handle very large SNP sets while also providing highly informative reconstruction rules in the form of logic formulas.
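
The two tasks named in the abstract, Tag SNP selection and majority-vote reconstruction, can be illustrated with a small sketch. This is not the authors' method: the abstract does not give the integer programming formulation or the voting scheme, so the sketch makes its own assumptions. Tag selection is assumed to be a set-covering style problem (choose the fewest SNP columns that still tell every pair of distinct haplotypes apart), and an exhaustive search over column subsets stands in for an integer programming solver, which is only practical on toy data. Reconstruction is assumed to take a per-position majority vote among the known haplotypes closest to the query on the tag positions. The function names `select_tag_snps` and `reconstruct` and the sample haplotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def select_tag_snps(haplotypes):
    """Return the smallest set of SNP columns that separates all distinct haplotypes.

    Assumption: a set-covering view of feature selection. Exhaustive search
    stands in for an integer programming solver (toy sizes only).
    """
    n_snps = len(haplotypes[0])
    pairs = list(combinations(sorted(set(haplotypes)), 2))
    for size in range(n_snps + 1):
        for cols in combinations(range(n_snps), size):
            # Every pair must differ on at least one chosen column.
            if all(any(a[j] != b[j] for j in cols) for a, b in pairs):
                return list(cols)
    return list(range(n_snps))


def reconstruct(tag_values, tags, haplotypes):
    """Rebuild a full haplotype from its tag values by majority vote.

    Assumption: the voters are the known haplotypes with the fewest
    mismatches on the tag positions; ties in a vote go to the allele
    seen first among the voters.
    """
    known = dict(zip(tags, tag_values))

    def mismatches(h):
        return sum(h[j] != v for j, v in known.items())

    best = min(mismatches(h) for h in haplotypes)
    voters = [h for h in haplotypes if mismatches(h) == best]
    rebuilt = []
    for j in range(len(haplotypes[0])):
        if j in known:
            rebuilt.append(known[j])  # tag positions are given, not voted on
        else:
            votes = Counter(h[j] for h in voters)
            rebuilt.append(votes.most_common(1)[0][0])
    return "".join(rebuilt)


# Hypothetical toy data: five haplotypes over four biallelic SNPs.
sample = ["0011", "0011", "1100", "1101", "0110"]
tags = select_tag_snps(sample)
print(tags, reconstruct("01", tags, sample))
```

On real data the exhaustive search would be replaced by an actual integer programming model and solver, and the quality of the selected tags would be measured by reconstruction accuracy on held-out individuals.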
