Article ID: | iaor19891166 |
Country: | France |
Volume: | 23 |
Issue: | 2 |
Start Page Number: | 193 |
End Page Number: | 236 |
Publication Date: | May 1989 |
Journal: | RAIRO Operations Research |
Authors: | Diday E. |
The aim of this paper is to define the symbolic approach in data analysis and to show that it extends data analysis to more complex data, which may be closer to multidimensional reality. It introduces several kinds of symbolic objects (‘events’, ‘assertions’, and also ‘hordes’ and ‘synthesis’ objects), which are defined by a logical conjunction of properties concerning the variables. They can, for instance, take several values for the same variable, and they are well adapted to the case of missing and nonsense values. Background knowledge may be represented by ‘pyramidal taxonomies’ and ‘affinities’. In clustering, the problem remains to find inter-class structures such as partitions, hierarchies and pyramids, but on symbolic objects instead of classical ones. Symbolic data analysis is conducted according to several principles: accuracy of the representation, coherence between the kinds of objects used at input and output, knowledge predominance for driving the algorithms, and self-explanation of the results. The paper defines the notions of order, union and intersection between symbolic objects and shows that they are organised according to an inheritance lattice. It studies several kinds of qualities of symbolic objects, of classes and of classifications of symbolic objects. Finally, it proposes several kinds of data analysis relating to the symbolic approach.
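To make the idea of order, union and intersection between symbolic objects more concrete, the following Python sketch models an ‘assertion’ as a conjunction of set-valued properties. It is an illustrative simplification, not the paper's own definitions: the representation (a dict from variable name to a set of admissible values) and the function names `satisfies`, `is_more_specific`, `union` and `intersection` are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Mapping

# Hypothetical representation: an assertion is a conjunction of properties,
# each stating that a variable takes its value in a given set.
Assertion = Dict[str, FrozenSet[str]]


def satisfies(individual: Mapping[str, str], a: Assertion) -> bool:
    """True if the individual meets every property of the assertion."""
    return all(individual.get(var) in allowed for var, allowed in a.items())


def is_more_specific(a: Assertion, b: Assertion) -> bool:
    """Order relation (assumed): a <= b if a implies every property of b."""
    return all(var in a and a[var] <= b[var] for var in b)


def union(a: Assertion, b: Assertion) -> Assertion:
    """Generalisation: keep shared variables, merge their value sets."""
    return {var: a[var] | b[var] for var in a.keys() & b.keys()}


def intersection(a: Assertion, b: Assertion) -> Assertion:
    """Specialisation: keep all variables, intersect shared value sets."""
    out = dict(a)
    for var, allowed in b.items():
        out[var] = out[var] & allowed if var in out else allowed
    return out


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    a = {"colour": frozenset({"red", "green"}), "size": frozenset({"small"})}
    b = {"colour": frozenset({"red"})}
    print(union(a, b))         # the more general object covering a and b
    print(intersection(a, b))  # the more specific object implied by both
    print(is_more_specific(intersection(a, b), a))
```

Under these assumptions, `union(a, b)` is more general than both arguments and `intersection(a, b)` is more specific than both, which is the kind of lattice structure the abstract refers to. An object with several admissible values for one variable (here, `colour` in `a`) shows how a single symbolic object can take several values for the same variable.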