Independent component analysis for near‐synonym choice

Article ID: iaor20133105
Volume: 55
Issue: 1
Start Page Number: 146
End Page Number: 155
Publication Date: Apr 2013
Journal: Decision Support Systems
Authors: ,
Keywords: classification, principal component analysis, speech recognition
Abstract:

Despite their similar meanings, near‐synonyms may have different usages in different contexts, and algorithms that can verify whether near‐synonyms match their given contexts have attracted increasing attention. Such algorithms have many applications, such as query expansion for information retrieval (IR), alternative word selection for writing support systems, and (near‐)duplicate detection for text summarization. In this paper, we propose a framework that incorporates latent semantic analysis (LSA) and independent component analysis (ICA) to automatically select suitable near‐synonyms according to the given context. LSA is used to discover useful latent features that do not frequently occur in the contexts of near‐synonyms, and ICA is used to estimate a set of independent components by minimizing the dependence between features. An SVM classifier is then trained on the independent components to predict the best near‐synonym. In experiments, we evaluate the proposed method on both Chinese and English sentences, and compare its performance to state‐of‐the‐art supervised and unsupervised methods. Experimental results show that training on independent components that contain useful contextual features with minimized term dependence improves the classifiers' ability to discriminate among near‐synonyms, thus yielding better performance.
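
The LSA, ICA, and SVM stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, approximates LSA with a truncated SVD over TF-IDF vectors, uses FastICA as the ICA estimator, and trains on a tiny made-up set of English contexts for the near-synonyms "strong" and "powerful". The toy sentences, the helper name `build_pipeline`, and the choice of four latent dimensions are all hypothetical.

```python
from sklearn.decomposition import FastICA, TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data: "_" marks the gap where the near-synonym belongs.
# The default tokenizer keeps only tokens of two or more word characters,
# so the single-character placeholder is dropped from the features.
contexts = [
    "he drank a cup of _ coffee",
    "a _ wind blew from the sea",
    "she has a _ accent",
    "the tea was too _ to drink",
    "there is _ evidence for the claim",
    "a _ smell of coffee filled the room",
    "the car has a _ engine",
    "a _ computer runs the simulation",
    "he is a _ politician",
    "the _ engine pulled the train",
    "a _ army marched north",
    "the new computer has a _ processor",
]
labels = ["strong"] * 6 + ["powerful"] * 6


def build_pipeline(n_latent=4):
    """Chain TF-IDF -> truncated SVD (LSA) -> FastICA -> linear SVM."""
    return make_pipeline(
        TfidfVectorizer(),                                  # context words as features
        TruncatedSVD(n_components=n_latent, random_state=0),  # LSA-style latent space
        FastICA(random_state=0, max_iter=1000),             # reduce dependence between features
        SVC(kernel="linear"),                               # near-synonym classifier
    )


model = build_pipeline()
model.fit(contexts, labels)

# Independent components that the SVM is trained on (one row per context).
components = model[:-1].transform(contexts)

test_contexts = ["the truck has a _ engine", "a cup of _ tea"]
predictions = model.predict(test_contexts)
print(components.shape, list(predictions))
```

With so few examples the predictions are not meaningful; the sketch only shows how the three stages feed into one another. A realistic setup would use a large corpus, a much higher latent dimensionality, and held-out evaluation data.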
