Cascade of genetic algorithm and decision tree for cancer classification on gene expression data

Article ID:	iaor20105164
Volume:	27
Issue:	3
Start Page Number:	201
End Page Number:	218
Publication Date:	Jul 2010
Journal:	Expert Systems
Authors:	Wu Tai-Hsi, Yeh Jinn-Yi
Keywords:	heuristics: genetic algorithms

Abstract:

Cancer classification through gene expression data analysis has produced remarkable results and indicates that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is the overwhelming number of genes relative to the number of training samples, which implies that a large number of irrelevant genes must be dealt with. Another challenge is the noise inherent in the data set, which makes accurate classification more difficult when the sample size is small. We apply genetic algorithms (GAs) with an initial solution provided by t statistics, called t-GA, to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on these selected genes. The performance of this approach is evaluated by comparing it with other gene selection methods on publicly available gene expression data sets. Experimental results indicate that t-GA performs best among the gene selection methods compared. The Z-score figure also shows that some genes are consistently and preferentially chosen by t-GA in each data set.
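
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of seeding a GA population with a t-statistic ranking. It assumes a two-class problem, Welch's form of the t statistic, a binary chromosome with a fixed number of selected genes, and a single seeded individual. None of these choices are confirmed by the record. The function names (`welch_t`, `rank_genes_by_t`, `seeded_population`) and the toy data are hypothetical, and the GA loop and decision-tree fitness evaluation are omitted.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes_by_t(X, y):
    """Return gene indices sorted by |t|, largest first.

    X is a list of samples (each a list of gene expression values);
    y holds the class label (0 or 1) of each sample.
    """
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(a, b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


def seeded_population(ranking, n_genes, k, pop_size, rng):
    """Build an initial GA population of binary chromosomes.

    Assumption: the first chromosome selects the k top-ranked genes;
    the remaining chromosomes select k genes uniformly at random.
    """
    seed = [0] * n_genes
    for g in ranking[:k]:
        seed[g] = 1
    population = [seed]
    for _ in range(pop_size - 1):
        chrom = [0] * n_genes
        for g in rng.sample(range(n_genes), k):
            chrom[g] = 1
        population.append(chrom)
    return population


# Hypothetical toy data: 6 samples x 4 genes; gene 2 separates the classes.
X = [
    [1.0, 5.0, 0.1, 3.0],
    [1.2, 4.8, 0.2, 2.9],
    [0.9, 5.1, 0.0, 3.2],
    [1.1, 5.0, 2.0, 3.1],
    [1.0, 4.9, 2.1, 3.0],
    [1.3, 5.2, 1.9, 2.8],
]
y = [0, 0, 0, 1, 1, 1]

ranking = rank_genes_by_t(X, y)
population = seeded_population(ranking, n_genes=4, k=2, pop_size=5,
                               rng=random.Random(0))
```

In a full pipeline, each chromosome's fitness would typically be the cross-validated accuracy of a decision tree trained on the selected genes; that step is left out here because the record does not describe it.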
