Feature selection for sentiment analysis based on content and syntax models

Article ID: iaor20125026
Volume: 53
Issue: 4
Start Page Number: 704
End Page Number: 711
Publication Date: Nov 2012
Journal: Decision Support Systems
Authors: ,
Keywords: behaviour, statistics: inference
Abstract:

Recent solutions for sentiment analysis have relied on feature selection methods ranging from lexicon-based approaches, where the set of features is generated by humans, to approaches that use general statistical measures, where features are selected solely on empirical evidence. The advantage of statistical approaches is that they are fully automatic; however, they often fail to separate features that carry sentiment from those that do not. In this paper we propose a set of new feature selection schemes that use a Content and Syntax model to automatically learn a set of features in a review document by separating the entities that are being reviewed from the subjective expressions that describe those entities in terms of polarities. By focusing only on the subjective expressions and ignoring the entities, we can choose more salient features for document-level sentiment analysis. The results obtained from using these features in a maximum entropy classifier are competitive with state-of-the-art machine learning approaches.
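
The idea of classifying on subjective expressions only can be illustrated with a minimal sketch. This is not the authors' implementation: the Content and Syntax model that learns the feature set is replaced here by a small hand-written word set (`SUBJECTIVE`), the training reviews are invented toy data, and the maximum entropy classifier is reduced to its two-class form (logistic regression) trained by stochastic gradient ascent. All names in the block are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: a hand-picked stand-in for the learned feature set. The paper
# learns subjective expressions with a Content and Syntax model instead.
SUBJECTIVE = {"great", "excellent", "love", "terrible", "awful", "boring"}

# ASSUMPTION: toy reviews, 1 = positive, 0 = negative (illustrative only).
DOCS = [
    "great camera excellent lens",
    "love this phone",
    "terrible battery awful screen",
    "boring movie",
]
LABELS = [1, 1, 0, 0]


def featurize(text, vocab):
    """Count only tokens found in vocab; entity words such as 'camera' are dropped."""
    return Counter(tok for tok in text.lower().split() if tok in vocab)


def score(features, weights, bias):
    """Linear score of a document under the current model."""
    return bias + sum(weights[w] * c for w, c in features.items())


def train_maxent(docs, labels, vocab, epochs=200, lr=0.5):
    """Two-class maximum entropy model fitted by stochastic gradient ascent."""
    weights = {w: 0.0 for w in vocab}
    bias = 0.0
    feats = [featurize(d, vocab) for d in docs]
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            p = 1.0 / (1.0 + math.exp(-score(x, weights, bias)))
            err = y - p  # gradient of the log-likelihood w.r.t. the score
            bias += lr * err
            for w, c in x.items():
                weights[w] += lr * err * c
    return weights, bias


def predict(text, weights, bias):
    """Return 1 for positive sentiment, 0 for negative."""
    x = featurize(text, weights)  # the weight keys act as the vocabulary
    return 1 if score(x, weights, bias) > 0 else 0


weights, bias = train_maxent(DOCS, LABELS, SUBJECTIVE)
```

Calling `predict("excellent phone", weights, bias)` scores the review using only the word "excellent", since "phone" names the reviewed entity and is not in the feature set.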
