Article ID: | iaor201523782 |
Volume: | 28 |
Issue: | 5 |
Start Page Number: | 508 |
End Page Number: | 523 |
Publication Date: | Jul 2012 |
Journal: | Quality and Reliability Engineering International |
Authors: | Saraiva Pedro M, Reis Marco S, Gomes Véronique M, Pereira Ana C |
Keywords: | statistics: regression |
A systematization of complex data structures originating in current process and product analysis activities is presented as a starting point for developing generalized and flexible frameworks for handling such challenging information sources. In this article, we present an abstract and unifying definition for a class of multidimensional data arrays (here called profiles) and build on it a taxonomy for their classification according to aspects relevant to the development of software platforms, namely the underlying data structure characteristics and the nature of the information they convey. This taxonomy is based on an extensive bibliographic review involving the analysis of complex datasets. We then identify the classes of profiles in higher demand (dominant classes) and, for these, conduct a comparison study of the methodologies that could be applied to two tasks: calibration/regression and process monitoring. The comparison study showed that the Tucker3 and N-way Partial Least Squares methods are good candidates for incorporation into a computational framework addressing the dominant classes of profiles that were identified. The purpose of this ongoing work is to provide fundamental information for developing new software tools able to handle a broad scope of problems involving the type of complex datasets found in practice, and to show how features for treating other, less frequent classes can be added modularly, adding value for users with more specific interests.