Article ID: | iaor200443 |
Country: | United States |
Volume: | 15 |
Issue: | 2 |
Start Page Number: | 123 |
End Page Number: | 147 |
Publication Date: | Apr 2003 |
Journal: | INFORMS Journal On Computing |
Authors: | Shahabi Cyrus, Banaei-Kashani Farnoush |
Keywords: | artificial intelligence: decision support, datamining |
The World Wide Web (WWW) is the largest distributed information space and has grown to encompass diverse information resources. Although the web is growing exponentially, the individual's capacity to read and digest content is essentially fixed. The full economic potential of the web will not be realized unless enabling technologies are provided to facilitate access to web resources. Currently, web personalization is the most promising approach to remedy this problem, and web mining, particularly web-usage mining, is considered a crucial component of any efficacious web-personalization system. In this paper, we describe a complete framework for web-usage mining to satisfy the challenging requirements of web-personalization applications. For online and anonymous web personalization to be effective, web-usage mining must be accomplished in real time and as accurately as possible. On the other hand, web-usage mining should allow a compromise between scalability and accuracy to be applicable to real-life websites with numerous visitors. Within our web-usage-mining framework, we introduce a distributed user-tracking approach for accurate, scalable, and implicit collection of the usage data. We also propose a new model, the feature-matrices (FM) model, to discover and interpret users' access patterns. With FM, various spatial and temporal features of usage data can be captured with flexible precision, so that we can trade off accuracy for scalability based on the specific application requirements. Moreover, the tunable complexity of the FM model allows real-time and adaptive access-pattern discovery from usage data. We define a novel similarity measure based on FM that is specifically designed for accurate classification of partial navigation patterns in real time. Our extensive experiments with both synthetic and real data verify the correctness and efficacy of our web-usage-mining framework for anonymous and efficient web personalization.