 
| Article ID: | iaor200969059 | 
| Country: | United Kingdom | 
| Volume: | 60 | 
| Issue: | 8 | 
| Start Page Number: | 1069 | 
| End Page Number: | 1084 | 
| Publication Date: | Aug 2009 | 
| Journal: | Journal of the Operational Research Society | 
| Authors: | Yang J, Ólafsson S, Kim J | 
| Keywords: | statistics: multivariate | 
Scalability of clustering algorithms is a critical issue facing the data mining community. One way to handle this issue is to use only a subset of all instances. This paper develops an optimization-based approach to the partitional clustering problem, using an algorithm specifically designed for noisy performance, a problem that arises when only a subset of instances is used. Numerical results show that computation time can be dramatically reduced by using a partial set of instances without sacrificing solution quality. The benefit becomes more evident as the size of the problem increases.
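
The idea of clustering on a subset of instances can be illustrated with a minimal sketch. It is not the paper's optimization algorithm: it assumes plain Lloyd's k-means as a stand-in partitional method, synthetic two-dimensional data, and a 10% random sample, and the names `kmeans`, `sse`, and `make_data` are hypothetical. Centers are fitted once on all instances and once on the sample, and both solutions are then scored on the full data set.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the paper's method)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as its group's mean; keep the old center
        # if a group ends up empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def make_data(n_per_cluster=500, seed=1):
    """Synthetic data: three Gaussian blobs (an assumption for illustration)."""
    rng = random.Random(seed)
    true_centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    return [
        (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in true_centers
        for _ in range(n_per_cluster)
    ]


data = make_data()
sample = random.Random(2).sample(data, len(data) // 10)  # 10% of instances

full_centers = kmeans(data, k=3)   # fitted on every instance
sub_centers = kmeans(sample, k=3)  # fitted on the subset only

# Both solutions are evaluated on the full data set.
full_sse = sse(data, full_centers)
sub_sse = sse(data, sub_centers)
print(f"SSE using all instances: {full_sse:.1f}")
print(f"SSE using 10% sample:    {sub_sse:.1f}")
```

Each iteration on the sample touches a tenth of the points, which is where the time saving comes from. The objective computed on a sample is only a noisy estimate of the full-data objective, which is the difficulty the abstract refers to.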