Article ID: | iaor200969051 |
Country: | United Kingdom |
Volume: | 60 |
Issue: | 8 |
Start Page Number: | 1123 |
End Page Number: | 1134 |
Publication Date: | Aug 2009 |
Journal: | Journal of the Operational Research Society |
Authors: | Pendharkar P |
Keywords: | neural networks, heuristics: genetic algorithms |
We study three approaches to formulating a misclassification cost minimizing genetic algorithm (GA) fitness function for a GA-neural network classifier: a fitness function that directly minimizes total misclassification cost, a fitness function that uses posterior probability to minimize total misclassification cost, and a hybrid fitness function that uses the average of the first two fitness functions. Using simulated data sets representing three different distributions and four different misclassification cost matrices, we test the performance of the three fitness functions on a two-group classification problem. Our results indicate that the posterior probability-based fitness function and the hybrid fitness function are less prone to overfitting the training data, but the direct misclassification cost minimizing fitness function provides the lowest overall misclassification cost in training tests. For holdout sample tests, when cost asymmetries are low (a ratio of 1:2 or less), the hybrid misclassification cost minimizing fitness function yields the best results; however, when cost asymmetries are high (a ratio of 1:4 or greater), the total misclassification cost minimizing function provides the best results. We validate our findings using real-world data on a bankruptcy prediction problem.
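The abstract does not give the exact formulas, so the following is only a minimal sketch of how the three fitness functions might look for a two-group problem. It assumes the network emits one output per example that can be read as an estimate of P(class 1 | x); the function names, the 0.5 decision threshold, and the cost parameters `c_fp` and `c_fn` are all hypothetical choices made for illustration, not the paper's definitions.

```python
from typing import Sequence


def direct_cost(probs: Sequence[float], labels: Sequence[int],
                c_fp: float, c_fn: float, threshold: float = 0.5) -> float:
    """Total cost of hard decisions (assumed rule: predict 1 when p >= threshold)."""
    total = 0.0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            total += c_fp  # false positive
        elif pred == 0 and y == 1:
            total += c_fn  # false negative
    return total


def posterior_cost(probs: Sequence[float], labels: Sequence[int],
                   c_fp: float, c_fn: float) -> float:
    """Expected cost, weighting each error cost by the estimated error probability."""
    total = 0.0
    for p, y in zip(probs, labels):
        # For a true class-1 example the error probability is (1 - p);
        # for a true class-0 example it is p.
        total += c_fn * (1.0 - p) if y == 1 else c_fp * p
    return total


def hybrid_cost(probs: Sequence[float], labels: Sequence[int],
                c_fp: float, c_fn: float) -> float:
    """Average of the direct and posterior-based costs."""
    return 0.5 * (direct_cost(probs, labels, c_fp, c_fn)
                  + posterior_cost(probs, labels, c_fp, c_fn))


# Toy network outputs and labels with a 1:2 cost asymmetry (illustrative values only).
probs = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
for fitness in (direct_cost, posterior_cost, hybrid_cost):
    print(fitness.__name__, fitness(probs, labels, c_fp=1.0, c_fn=2.0))
```

In a GA, each candidate weight vector would be scored by one of these functions on the training data, with lower cost meaning higher fitness. Under these assumptions the direct version is piecewise constant in the network outputs, while the posterior-based version changes smoothly with them, which may be one reason the latter is reported as less prone to overfitting.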