Article ID: | iaor2017511 |
Volume: | 58 |
Issue: | 4 |
Start Page Number: | 437 |
End Page Number: | 472 |
Publication Date: | Dec 2016 |
Journal: | Australian & New Zealand Journal of Statistics |
Authors: | Fernández D, Arnold R |
Keywords: | statistics: regression, statistics: distributions |
One of the key questions in the use of mixture models concerns the choice of the number of components most suitable for a given data set. In this paper we investigate answers to this problem in the context of likelihood‐based clustering of the rows of a matrix of ordinal data modelled by the ordered stereotype model. Two methodologies for selecting the best model are demonstrated and compared. The first approach fits a separate model to the data for each possible number of clusters, and then uses an information criterion to select the best model. The second approach uses a Bayesian construction in which the parameters and the number of clusters are estimated simultaneously from their joint posterior distribution. Simulation studies are presented which include a variety of scenarios in order to test the reliability of both approaches. Finally, the results of the application of model selection to two real data sets are shown.
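The first approach described above (fit one model per candidate number of clusters, then compare them with an information criterion) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which criteria were used, so AIC and BIC are assumed here as two common choices. The log-likelihoods and parameter counts are made-up placeholder values, and `select_num_clusters` is a hypothetical helper name. Taking `n_obs` as the sample size in the BIC penalty is also an assumption.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: -2 log L + 2k (lower is better)."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k log n (lower is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_num_clusters(fits, n_obs, criterion="bic"):
    """Pick the number of clusters with the smallest criterion value.

    fits maps each candidate number of clusters to a pair
    (maximised log-likelihood, number of free parameters).
    """
    scores = {}
    for n_clusters, (log_lik, n_params) in fits.items():
        if criterion == "aic":
            scores[n_clusters] = aic(log_lik, n_params)
        elif criterion == "bic":
            scores[n_clusters] = bic(log_lik, n_params, n_obs)
        else:
            raise ValueError(f"unknown criterion: {criterion!r}")
    best = min(scores, key=scores.get)
    return best, scores


# Placeholder results for models with 1 to 4 row clusters (illustrative only).
fits = {
    1: (-1250.0, 8),
    2: (-1180.0, 11),
    3: (-1172.0, 14),
    4: (-1170.5, 17),
}

best_bic, bic_scores = select_num_clusters(fits, n_obs=500, criterion="bic")
best_aic, aic_scores = select_num_clusters(fits, n_obs=500, criterion="aic")
print("BIC selects", best_bic, "clusters; AIC selects", best_aic)
```

With these placeholder values the two criteria disagree, because BIC penalises each extra parameter more heavily than AIC once the sample size exceeds about 7. Disagreement of this kind is one reason a simulation-based comparison of selection methods is informative.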