Article ID: | iaor20162583 |
Volume: | 70 |
Issue: | 3 |
Start Page Number: | 229 |
End Page Number: | 259 |
Publication Date: | Aug 2016 |
Journal: | Statistica Neerlandica |
Authors: | Cerasa Andrea |
Keywords: | statistics: regression, economics, simulation |
This article proposes three methods for merging homogeneous clusters of observations that are grouped according to a pre‐existing (known) classification. This clusterwise regression problem is especially relevant in the analysis of international trade data, where transaction prices can be grouped according to the corresponding origin–destination combination. A proper merging of these prices could simplify the analysis of the market without affecting the representativeness of the data, and could highlight commercial anomalies that may hide fraud. The three algorithms proposed are based on an iterative application of the F‐test and have the advantage of being extremely flexible: they do not require the number of final clusters to be predetermined, and their output depends only on a tuning parameter. Monte Carlo results show very good performance for all the procedures, and the application to two empirical data sets demonstrates the practical utility of the proposed methods for reducing the dimension of the market and isolating suspicious commercial behaviors.
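The general idea of merging pre-defined groups by repeated F-tests can be illustrated with a minimal sketch. It is not a reproduction of any of the article's three algorithms, whose details are not given in this abstract. The sketch assumes a simple linear regression of price on quantity within each group, a Chow-type F statistic for testing whether two groups share one regression line, and a greedy rule that merges the most similar pair first. The function names and the threshold `f_crit` (standing in for the tuning parameter) are hypothetical.

```python
from itertools import combinations


def rss(points):
    """Residual sum of squares of the least-squares line y = a + b*x.

    Assumes the x values in `points` are not all identical.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    a = my - b * mx
    return sum((y - a - b * x) ** 2 for x, y in points)


def f_stat(g1, g2, k=2):
    """Chow-type F statistic for H0: both groups follow the same line.

    k is the number of regression coefficients (intercept and slope).
    Assumes the separate fits are not both perfect (nonzero RSS).
    """
    rss_sep = rss(g1) + rss(g2)
    rss_pool = rss(g1 + g2)
    df = len(g1) + len(g2) - 2 * k
    return ((rss_pool - rss_sep) / k) / (rss_sep / df)


def merge_clusters(groups, f_crit):
    """Greedily merge the pair with the smallest F until none is below f_crit.

    `f_crit` is a hypothetical tuning parameter: a threshold on the F statistic.
    """
    clusters = {(name,): list(pts) for name, pts in groups.items()}
    while len(clusters) > 1:
        (a, b), f = min(
            (((a, b), f_stat(clusters[a], clusters[b]))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if f >= f_crit:
            break  # no remaining pair is similar enough to merge
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
    return clusters


# Made-up data: groups A and B share the line y = 2x, group C follows y = 5x.
xs = [1, 2, 3, 4, 5, 6]
noise_a = [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]
noise_b = [-0.1, 0.1, -0.15, 0.2, -0.05, 0.1]
noise_c = [0.1, -0.1, 0.05, -0.15, 0.2, -0.1]
groups = {
    "A": [(x, 2 * x + e) for x, e in zip(xs, noise_a)],
    "B": [(x, 2 * x + e) for x, e in zip(xs, noise_b)],
    "C": [(x, 5 * x + e) for x, e in zip(xs, noise_c)],
}
result = merge_clusters(groups, f_crit=4.0)
print(sorted(result))
```

With this made-up data, groups A and B are merged into one cluster and C stays separate. The number of final clusters is never specified in advance; it follows from the threshold alone, which mirrors the flexibility described in the abstract. A significance level converted to a critical value of the F distribution would be a natural alternative to a raw threshold.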