Article ID: | iaor20118847 |
Volume: | 62 |
Issue: | 5 |
Start Page Number: | 2200 |
End Page Number: | 2208 |
Publication Date: | Sep 2011 |
Journal: | Computers and Mathematics with Applications |
Authors: | Yang Miin-Shen, Yu Jian, Lee E Stanley |
Keywords: | data mining, statistics: sampling |
Although much research on cluster analysis has considered feature (or variable) weights, little effort has been devoted to sample weights in clustering. In practice, not every sample in a data set is equally important for cluster analysis, so it is worthwhile to obtain proper sample weights when clustering a data set. In this paper, we consider a probability distribution over a data set to represent its sample weights. We then apply the maximum entropy principle to compute these sample weights automatically for clustering. Such a method can generate sample-weighted versions of most clustering algorithms, such as