Development of Generalized Platforms for the Analysis of Complex Datasets

Article ID:	iaor201523782
Volume:	28
Issue:	5
Start Page Number:	508
End Page Number:	523
Publication Date:	Jul 2012
Journal:	Quality and Reliability Engineering International
Authors:	Saraiva Pedro M, Reis Marco S, Gomes Véronique M, Pereira Ana C
Keywords:	statistics: regression

Abstract:

A systematization of complex data structures, originating in current process and product analysis activities, is presented as a starting point for developing generalized and flexible frameworks for handling such challenging information sources. In this article, we present an abstract and unifying definition for a class of multidimensional data arrays (here called profiles), upon which we build a taxonomy for their classification according to aspects relevant for the development of software platforms, namely, the underlying data structure characteristics and the nature of the information they convey. This taxonomy is based on an extensive bibliographic review of work involving the analysis of complex datasets. We then identify the classes of profiles in highest demand (dominant classes) and, for these, conduct a comparison study of the methodologies that could be applied to two tasks: calibration/regression and process monitoring. The comparison study showed that the Tucker3 and N‐way Partial Least Squares methods are good candidates for incorporation into a computational framework addressing the dominant classes of profiles that were identified. The purpose of this ongoing work is to provide fundamental information for developing new software tools able to handle a broad scope of problems involving the types of complex datasets found in practice, and to show how features for treating other, less frequent classes can be added modularly, adding value for users with more specific interests.
