Article ID: | iaor2002773 |
Country: | United Kingdom |
Volume: | 4 |
Issue: | 1 |
Start Page Number: | 53 |
End Page Number: | 59 |
Publication Date: | Jan 1992 |
Journal: | IMA Journal of Mathematics Applied in Business and Industry |
Authors: | Fogarty Terence C., Ireson Neil S., Battle Steven A. |
Keywords: | credit scoring, genetic algorithms, credit cards |
Learning a set of rules from data is essentially a problem of classifying the fields of data available and combining them to give the best prediction of the goal variable. A suitable cost function for this problem is supplied by information theory in the form of information entropy. Limiting the number of classes for each field to a relatively small number, and allowing the user to define when the predictive value of a class can be considered irrelevant, avoids generating a set of rules that carries much of the irrelevant information embodied in the data. The genetic algorithm uses a technique analogous to natural evolution to search a large space of possible solutions for a near-optimal one. The search is conducted by evaluating a number of randomly generated possible solutions from the space, and then repeatedly selecting pairs of these solutions with a probability proportional to their value, forming new solutions from the pairs using operators such as crossover and mutation, evaluating the new solutions, and replacing old solutions with them. Such a search requires thousands of evaluations to converge. To accelerate the process, a system has been built on a large multi-processing computer and will run on a Transputer-based parallel database engine. The rule-based systems are being built in collaboration with TSB Trustcard, and are an application of current research to problems of credit control.
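The search loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the authors' encoding, parameters, or exact entropy measure, so everything specific here is an assumption: a candidate solution assigns each of 10 field values to one of 3 classes, its value is the information gain (entropy of the goal minus entropy of the goal given the class), the data are synthetic, and the names `entropy`, `information_gain` and `evolve` are hypothetical. The user-defined relevance threshold and the parallel implementation are not modelled.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(assignment, field, goal):
    """H(goal) - H(goal | class), where assignment[v] is the class of field value v."""
    groups = {}
    for value, outcome in zip(field, goal):
        groups.setdefault(assignment[value], []).append(outcome)
    conditional = sum(len(g) / len(goal) * entropy(g) for g in groups.values())
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, entropy(goal) - conditional)


def evolve(field, goal, n_values, n_classes=3, pop_size=20,
           generations=200, mutation_rate=0.05, seed=0):
    """Toy genetic algorithm; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)

    # Evaluate a number of randomly generated candidate solutions.
    population = [[rng.randrange(n_classes) for _ in range(n_values)]
                  for _ in range(pop_size)]
    scores = [information_gain(a, field, goal) for a in population]
    initial_best = max(scores)

    for _ in range(generations):
        # Select a pair with probability proportional to value
        # (a small constant keeps the weights strictly positive).
        weights = [s + 1e-9 for s in scores]
        parent_a, parent_b = rng.choices(population, weights=weights, k=2)

        # One-point crossover followed by per-gene mutation.
        cut = rng.randrange(1, n_values)
        child = parent_a[:cut] + parent_b[cut:]
        child = [rng.randrange(n_classes) if rng.random() < mutation_rate else gene
                 for gene in child]

        # Evaluate the new solution and let it replace the worst old one.
        worst = min(range(pop_size), key=scores.__getitem__)
        population[worst] = child
        scores[worst] = information_gain(child, field, goal)

    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best], initial_best


# Synthetic data (not from the paper): the goal is mostly 1 when the field value is 7 or more.
data_rng = random.Random(1)
field = [data_rng.randrange(10) for _ in range(500)]
goal = [int(v >= 7) if data_rng.random() > 0.1 else int(v < 7) for v in field]

best, best_gain, initial_best = evolve(field, goal, n_values=10)
print("class per field value:", best)
print("information gain (bits):", round(best_gain, 3))
```

Replacing only the worst solution in each step is one simple replacement policy among many; it keeps the best solution found so far in the population, so the best score never decreases. Each step costs one evaluation over the whole data set, which is why a search of this kind benefits from parallel hardware.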