Article ID: | iaor2005433 |
Country: | United Kingdom |
Volume: | 31 |
Issue: | 14 |
Start Page Number: | 2387 |
End Page Number: | 2404 |
Publication Date: | Dec 2004 |
Journal: | Computers and Operations Research |
Authors: | Caramia M., Felici G., Pezzoli A. |
Keywords: | computers: information, heuristics, datamining |
The problem of obtaining relevant results in web searching has been tackled with several approaches. The most popular search engines use very effective techniques when no a priori knowledge of the user's needs is available besides the search keywords. In other settings, however, it is conceivable to design search methods that operate on a thematic database of web pages referring to a common body of knowledge or to specific sets of users. Starting from these premises, we have designed and developed a search method that deploys data mining and optimization techniques to return a smaller, more relevant set of pages as the final result of a user search. We adopt a vectorization method based on search context and user profile, then apply clustering techniques whose results are refined by a specifically designed genetic algorithm. In this paper we describe the method, its implementation, and the algorithms applied, and we discuss some experiments run on test sets of web pages.
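The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a plain term-count vectorization over a fixed vocabulary, cosine similarity against a profile vector, and a small generic genetic algorithm that searches for a restricted subset of pages. The clustering step is omitted, and every name and parameter here (`vectorize`, `cosine`, `fitness`, `refine`, `size_penalty`, the sample pages) is hypothetical.

```python
import math
import random
from collections import Counter


def vectorize(text, vocabulary):
    """Term-count vector of `text` over a fixed vocabulary (assumed scheme)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fitness(mask, vectors, profile, size_penalty=0.05):
    """Mean similarity of the selected pages to the profile, minus a
    penalty per selected page (hypothetical objective, not from the paper)."""
    chosen = [vec for keep, vec in zip(mask, vectors) if keep]
    if not chosen:
        return 0.0
    relevance = sum(cosine(vec, profile) for vec in chosen) / len(chosen)
    return relevance - size_penalty * len(chosen)


def refine(vectors, profile, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Generic GA over 0/1 masks: elitism, tournament selection,
    one-point crossover, bit-flip mutation. Needs at least two pages."""
    rng = random.Random(seed)
    n = len(vectors)

    def score(mask):
        return fitness(mask, vectors, profile)

    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # keep the two best masks unchanged
        while len(next_gen) < pop_size:
            mother = max(rng.sample(population, 3), key=score)
            father = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)
            child = mother[:cut] + father[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


# Made-up example data, for illustration only.
vocabulary = ["genetic", "algorithm", "clustering", "recipe", "pasta"]
pages = [
    "genetic algorithm for clustering web pages",
    "pasta recipe with tomato",
    "clustering algorithm survey",
    "genetic algorithm tutorial",
]
profile = vectorize("genetic algorithm clustering", vocabulary)
vectors = [vectorize(page, vocabulary) for page in pages]
best = refine(vectors, profile)
selected = [page for keep, page in zip(best, pages) if keep]
```

The fixed seed makes repeated runs return the same mask. In a fuller pipeline, a clustering step would presumably group the page vectors first and the genetic algorithm would then refine those groups; how the paper actually combines the two is not stated in the abstract.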