Automatically extracting user reviews from forum sites

Article ID: iaor20119607
Volume: 62
Issue: 7
Start Page Number: 2779
End Page Number: 2792
Publication Date: Oct 2011
Journal: Computers and Mathematics with Applications
Authors: , ,
Keywords: social, networks, e-commerce
Abstract:

User reviews on forum sites are an important information source for many popular applications (e.g., monitoring and analysis of public opinion), and they are usually represented in the form of structured records. To the best of our knowledge, little existing work reported in the literature has systematically investigated the problem of extracting user reviews from forum sites. Besides the variety of web page templates, user-generated reviews raise two new challenges. First, the inconsistency of review contents, in terms of both the document object model (DOM) tree and visual appearance, impairs the similarity between review records. Second, the review content in a review record corresponds to a complicated subtree rather than a single node in the DOM tree. To tackle these challenges, we present WeRE, a system that performs automatic user review extraction. The review records are first extracted from web pages using the proposed level-weighted tree similarity algorithm, and the review contents within those records are then extracted precisely by measuring node consistency. Our experimental results on 20 forum sites indicate that WeRE can achieve high extraction accuracy.
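
The abstract does not define the level-weighted tree similarity algorithm, so the following Python sketch only illustrates the general idea under stated assumptions and is not the WeRE method itself. It assumes that similarity is computed by comparing the tag multisets of two DOM subtrees level by level, and that a hypothetical `decay` parameter gives shallow levels (the record template) more weight than deep levels (the free-form review content). The `Node` class, the function names, and the weighting scheme are all illustrative choices.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A minimal stand-in for a DOM element: a tag name plus child nodes."""
    tag: str
    children: list = field(default_factory=list)


def tags_by_level(root):
    """Return one Counter of tag names per tree level, starting at the root."""
    levels = []
    frontier = [root]
    while frontier:
        levels.append(Counter(node.tag for node in frontier))
        frontier = [child for node in frontier for child in node.children]
    return levels


def level_weighted_similarity(a, b, decay=0.5):
    """Similarity in [0, 1]; level d is weighted by decay ** d (an assumption)."""
    levels_a, levels_b = tags_by_level(a), tags_by_level(b)
    matched = total = 0.0
    for depth in range(max(len(levels_a), len(levels_b))):
        tags_a = levels_a[depth] if depth < len(levels_a) else Counter()
        tags_b = levels_b[depth] if depth < len(levels_b) else Counter()
        weight = decay ** depth
        # Counter intersection keeps the minimum count of each shared tag.
        matched += weight * sum((tags_a & tags_b).values())
        total += weight * max(sum(tags_a.values()), sum(tags_b.values()))
    return matched / total


# Two hypothetical review records sharing a template but not their content.
record_1 = Node("div", [Node("span"), Node("p", [Node("b")])])
record_2 = Node("div", [Node("span"), Node("p", [Node("i"), Node("a")])])
score = level_weighted_similarity(record_1, record_2)
```

Because the two records differ only at the deepest level, which carries the smallest weight, the score stays high even though the review contents have different markup. This is the property the abstract points to when it notes that inconsistent review content impairs the similarity between records.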
