Article ID: iaor20072519
Country: India
Volume: 4
Issue: J06
Start Page Number: 1
End Page Number: 22
Publication Date: Jun 2006
Journal: International Journal of Applied Mathematics & Statistics (IJAMAS)
Authors: Stainvas Inna, Intrator Nathan
Classification and recognition of high-dimensional data are difficult due to the ‘curse of dimensionality’: there is typically insufficient data to train an estimator robustly. The problem may be overcome by dimensionality reduction. Many statistical models, such as linear discriminant analysis (LDA) and neural networks (NNs), include dimensionality reduction as an implicit preprocessing step. However, projection onto discriminant directions is not sufficient by itself, since the number of direction parameters remains large (proportional to the dimensionality of the data); the models are still heavily parameterized and require regularization. In this work, we propose to regularize the low-dimensional structure of the projection parameter space using compression concepts. We assume that the intrinsic dimensionality of the discriminant space spanned by the projection directions is small, so that the directions can be represented sufficiently well as a linear superposition of a small number of wavelet functions from a wavelet packet basis. We further introduce a simple incremental way to increase the dimensionality of the parameter space using hypothesis testing, and apply the technique to logistic regression and to Fisher linear discrimination. Three benchmark data sets, triangular waveforms, the vowel data set (CMU repository), and a letter data set (DELVE), are used to demonstrate the proposed method. We show that this approach leads to significant classification improvement.
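As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below constrains a logistic-regression direction to a small number of wavelet basis functions and grows that number incrementally. Several simplifications are assumed: a plain orthonormal Haar basis stands in for the paper's wavelet packet basis, gradient descent stands in for whatever fitting procedure the authors use, and a validation-loss stopping rule stands in for their hypothesis-testing step; the function names (`haar_basis`, `incremental_fit`, etc.) are hypothetical.

```python
import numpy as np

def haar_basis(d):
    """Orthonormal Haar wavelet basis for R^d (d a power of 2); columns are
    basis vectors ordered coarse-to-fine, so low indices give smooth directions."""
    assert d & (d - 1) == 0, "d must be a power of 2"
    H = np.array([[1.0]])
    while H.shape[0] < d:
        n = H.shape[0]
        top = np.kron(H, np.array([1.0, 1.0]))           # scaling (smooth) rows
        bot = np.kron(np.eye(n), np.array([1.0, -1.0]))  # detail rows
        H = np.vstack([top, bot]) / np.sqrt(2.0)
    return H.T

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))

def fit_logreg_coeffs(X, y, B_k, n_iter=500, lr=0.5):
    """Logistic regression with the direction constrained to w = B_k @ c;
    equivalent to ordinary logistic regression on the projected data X @ B_k,
    so only k coefficients are estimated instead of d direction parameters."""
    Z = X @ B_k
    c = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        grad = Z.T @ (sigmoid(Z @ c) - y) / len(y)       # log-loss gradient in c
        c -= lr * grad
    return c

def log_loss(y, p, eps=1e-12):
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def incremental_fit(X_tr, y_tr, X_va, y_va, B, patience=5):
    """Grow the number k of retained basis functions one at a time and keep
    the k with the best validation loss, stopping after `patience` steps
    without improvement (a crude stand-in for the paper's hypothesis test)."""
    best_loss, best_c, best_k = np.inf, None, 0
    for k in range(1, B.shape[1] + 1):
        c = fit_logreg_coeffs(X_tr, y_tr, B[:, :k])
        loss = log_loss(y_va, sigmoid(X_va @ B[:, :k] @ c))
        if loss < best_loss:
            best_loss, best_c, best_k = loss, c, k
        elif k - best_k >= patience:
            break
    return B[:, :best_k] @ best_c, best_k  # full-dimensional direction, chosen k

# Toy usage: a 64-dimensional problem whose true direction uses 3 basis functions.
rng = np.random.default_rng(0)
d, n = 64, 200
B = haar_basis(d)
w_true = B[:, :3] @ np.array([2.0, -1.5, 1.0])
X = rng.standard_normal((2 * n, d))
y = (X @ w_true + 0.5 * rng.standard_normal(2 * n) > 0).astype(float)
w_hat, k = incremental_fit(X[:n], y[:n], X[n:], y[n:], B)
print(f"selected k = {k} of {d} basis functions")
```

The point of the construction is that the estimated direction w = B_k @ c lives in the full d-dimensional space, yet only k coefficients are fit, which is what makes the estimator trainable from limited data.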