Linear models for cost-sensitive classification


Article ID:	iaor201528857
Volume:	32
Issue:	5
Start Page Number:	622
End Page Number:	636
Publication Date:	Oct 2015
Journal:	Expert Systems
Authors:	Pendharkar Parag C
Keywords:	classification, discriminant analysis, linear systems, support vector machines, mixed integer linear programme (MILP)

Abstract:

In this paper, we investigate the performance of statistical, mathematical programming and heuristic linear models for cost‐sensitive classification. In particular, we use five cost‐sensitive techniques: Fisher's discriminant analysis (DA), asymmetric misclassification cost mixed integer programming (AMC‐MIP), cost‐sensitive support vector machine (CS‐SVM), a hybrid support vector machine and mixed integer programming (SVMIP) technique, and a heuristic cost‐sensitive genetic algorithm (CGA). Using simulated datasets of varying group overlaps, data distributions and class biases, and real‐world datasets from financial and medical domains, we compare the performance of the five techniques based on overall holdout sample misclassification cost. The results of our experiments on simulated datasets indicate that when group overlap is low and the data distribution is exponential, DA appears to provide superior performance. For all other situations with simulated datasets, CS‐SVM provides superior performance. For real‐world datasets from the financial domain, CGA and AMC‐MIP hold a slight edge over the two SVM‐based classifiers. However, for medical domains with mixed continuous and discrete attributes, the SVM classifiers perform better than the heuristic (CGA) and AMC‐MIP classifiers. The SVMIP model is both the least computationally efficient and a poorly performing model.
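
The comparison criterion named in the abstract, overall holdout sample misclassification cost, can be illustrated with a minimal sketch. The code below is not taken from the paper: the function names, the cost values (a false negative costing 5, a false positive costing 1), the linear weights and the four holdout points are all hypothetical, chosen only to show how asymmetric costs change the score of a fixed linear classifier.

```python
from typing import List, Sequence


def linear_predict(weights: Sequence[float], bias: float,
                   rows: Sequence[Sequence[float]]) -> List[int]:
    """Label a row 1 when w.x + b >= 0, else 0 (a generic linear rule)."""
    return [
        1 if sum(w * x for w, x in zip(weights, row)) + bias >= 0 else 0
        for row in rows
    ]


def misclassification_cost(y_true: Sequence[int], y_pred: Sequence[int],
                           cost_fn: float = 5.0, cost_fp: float = 1.0) -> float:
    """Sum asymmetric error costs over a holdout sample.

    cost_fn: assumed cost of predicting 0 when the true label is 1.
    cost_fp: assumed cost of predicting 1 when the true label is 0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = 0.0
    for truth, guess in zip(y_true, y_pred):
        if truth == 1 and guess == 0:
            total += cost_fn   # false negative
        elif truth == 0 and guess == 1:
            total += cost_fp   # false positive
    return total


# Hypothetical holdout sample: two attributes per observation.
holdout_X = [(0.2, 1.0), (1.5, 0.3), (0.4, 0.9), (2.0, 1.0)]
holdout_y = [0, 1, 1, 0]

# Hypothetical fitted linear model: w = (1, -1), b = 0.
preds = linear_predict((1.0, -1.0), 0.0, holdout_X)

asymmetric = misclassification_cost(holdout_y, preds)            # 5 + 1
symmetric = misclassification_cost(holdout_y, preds, 1.0, 1.0)   # 1 + 1
print(preds, asymmetric, symmetric)
```

With these assumed values the classifier makes one false negative and one false positive, so the two errors count equally under symmetric costs but the false negative dominates under the asymmetric ones. Any of the five techniques that yields a linear decision rule could be scored the same way.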
