Cluster-Based Bounded Influence Regression

Article ID: iaor201523793
Volume: 30
Issue: 1
Start Page Number: 97
End Page Number: 109
Publication Date: Feb 2014
Journal: Quality and Reliability Engineering International
Keywords: case studies, cluster analysis, Monte Carlo method
Abstract:

A regression methodology is introduced that yields competitive, robust, efficient, high-breakdown regression parameter estimates and provides an informative summary of possible multiple-outlier structure. The proposed method blends a cluster analysis phase with a controlled bounded influence (BI) regression phase and is therefore referred to as cluster-based bounded influence regression, or CBI. The data space is represented by a special set of anchor points, and a collection of point-addition OLS regression estimators forms the basis of a metric that defines the similarity between any two observations. Cluster analysis then yields a main cluster ‘half-set’ of observations, with the remaining observations comprising one or more minor clusters. An initial regression estimator arises from the main cluster, and a group-additive DFFITS argument is used to carefully activate the minor clusters through a BI regression framework. CBI achieves a 50% breakdown point, is regression, scale and affine equivariant, and is asymptotically normal. Case studies and Monte Carlo results demonstrate the performance advantage of CBI over other popular robust regression procedures with respect to coefficient stability, scale estimation and standard errors. The dendrogram of the clustering process and the weight plot are graphical displays available for multivariate outlier detection. Overall, the proposed methodology represents an advancement in the field of robust regression, offering a distinct philosophical viewpoint on data analysis and the marriage of estimation with diagnostic summary.
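The abstract does not give the CBI formulas, so the following is only a generic sketch of the two ingredients it names: the DFFITS diagnostic and a bounded influence reweighting. It assumes a Welsch-type weight, min(1, c/|DFFITS|), and the conventional rule-of-thumb cutoff c = 2*sqrt(p/n). Both are illustrative choices, not the authors' group-additive rule, and the anchor-point clustering phase is not reproduced. The function names and the toy data are hypothetical.

```python
import numpy as np

def dffits(X, y):
    """DFFITS for an OLS fit; X must already contain the intercept column."""
    n, p = X.shape
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)                      # leverages (hat-matrix diagonal)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                              # OLS residuals
    sse = e @ e
    # Leave-one-out error variance, obtained without refitting n times.
    s2_del = (sse - e**2 / (1.0 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_del * (1.0 - h))           # externally studentized residuals
    return t * np.sqrt(h / (1.0 - h))

def bounded_influence_weights(d, cutoff):
    """Assumed Welsch-type weights: 1 inside the cutoff, cutoff/|d| outside."""
    a = np.maximum(np.abs(d), np.finfo(float).tiny)
    return np.minimum(1.0, cutoff / a)

def weighted_ls(X, y, w):
    """Weighted least squares via square-root-weight row scaling."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Toy data (hypothetical): a clean line plus one high-leverage outlier.
rng = np.random.default_rng(0)
n = 20
x = np.linspace(0.0, 1.0, n)
x[-1] = 5.0
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=n)
y[-1] = -40.0
X = np.column_stack([np.ones(n), x])

d = dffits(X, y)
cutoff = 2.0 * np.sqrt(X.shape[1] / n)            # rule-of-thumb threshold
w = bounded_influence_weights(d, cutoff)
beta_w = weighted_ls(X, y, w)                     # one reweighting step only
```

This performs a single reweighting step from an OLS start. A practical BI fit would typically iterate and, as the abstract describes for CBI, would start from a robust initial estimator rather than from OLS.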
