A framework for the evaluation of session reconstruction heuristics in Web-usage analysis

Article ID: iaor2004434
Country: United States
Volume: 15
Issue: 2
Start Page Number: 171
End Page Number: 190
Publication Date: Apr 2003
Journal: INFORMS Journal On Computing
Authors: , , ,
Keywords: performance, data mining
Abstract:

Web-usage mining has become the subject of intensive research, as its potential for personalized services, adaptive Web sites and customer profiling is recognized. However, the reliability of Web-usage mining results depends heavily on the proper preparation of the input datasets. In particular, errors in the reconstruction of sessions and incomplete tracing of users' activities on a site can easily result in invalid patterns and wrong conclusions. In this study, we evaluate the performance of heuristics employed to reconstruct sessions from server log data. Such heuristics are called upon to partition activities first by user and then by each user's visit to the site, where user identification mechanisms, such as cookies, may or may not be available. We propose a set of performance measures that are sensitive to two types of reconstruction errors and appropriate for different knowledge discovery (KDD) applications. We have tested our framework on the Web server data of a frame-based Web site. The first experiment concerned a specific KDD application and showed the sensitivity of the heuristics to particulars of the site's structure and traffic. The second experiment is not bound to a specific application but rather compares the performance of the heuristics across different measures and thus across different application types. Our results show that there is no single best heuristic, but our measures help the analyst select the heuristic best suited to the application at hand.
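
To make the two-step partitioning described in the abstract concrete, the following is a minimal sketch of one possible session reconstruction heuristic: requests are grouped by a user key, and each user's requests are then split into visits wherever the gap between consecutive requests exceeds a threshold. The abstract does not name the heuristics that were evaluated, so the time-gap rule, the 30-minute default, the function name `reconstruct_sessions`, and the `(user_key, timestamp, url)` record layout are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def reconstruct_sessions(requests, max_gap_seconds=1800):
    """Split log requests into sessions with a simple time-gap heuristic.

    requests: iterable of (user_key, timestamp_seconds, url) tuples.
    user_key is whatever identifies a user (a cookie ID, or an
    IP/user-agent pair when no cookie is available); this layout is
    hypothetical and chosen only for the example.
    max_gap_seconds: assumed threshold; a longer pause starts a new visit.
    """
    # Step 1: partition activities by user.
    by_user = defaultdict(list)
    for user_key, timestamp, url in requests:
        by_user[user_key].append((timestamp, url))

    # Step 2: partition each user's activities by visit.
    sessions = []
    for user_key, hits in by_user.items():
        hits.sort(key=lambda hit: hit[0])  # order requests by time
        current = []
        last_timestamp = None
        for timestamp, url in hits:
            gap_too_long = (
                last_timestamp is not None
                and timestamp - last_timestamp > max_gap_seconds
            )
            if gap_too_long:
                sessions.append((user_key, current))
                current = []
            current.append(url)
            last_timestamp = timestamp
        if current:
            sessions.append((user_key, current))
    return sessions
```

A heuristic of this kind can err in two directions: it can split one real visit into several sessions, or merge separate visits into one. Changing `max_gap_seconds` trades one kind of error against the other, which is the sort of behavior that performance measures for reconstruction quality are meant to expose.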
