| Article ID: | iaor2005433 |
| Country: | United Kingdom |
| Volume: | 31 |
| Issue: | 14 |
| Start Page Number: | 2387 |
| End Page Number: | 2404 |
| Publication Date: | Dec 2004 |
| Journal: | Computers and Operations Research |
| Authors: | Caramia M., Felici G., Pezzoli A. |
| Keywords: | computers: information, heuristics, data mining |
The problem of obtaining relevant results in web searching has been tackled with several approaches. The most popular search engines use very effective techniques when no a priori knowledge of the user's needs is available beyond the search keywords. In other settings, however, it is conceivable to design search methods that operate on a thematic database of web pages that refer to a common body of knowledge or to specific sets of users. On these premises we have designed and developed a search method that uses data mining and optimization techniques to provide a more restricted and more significant set of pages as the final result of a user search. We adopt a vectorization method based on search context and user profile, apply clustering techniques to the resulting vectors, and refine the clusters with a specifically designed genetic algorithm. In this paper we describe the method, its implementation, and the algorithms applied, and discuss some experiments run on test sets of web pages.
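
The abstract does not give the details of the vectorization or clustering steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a page is represented by counts of context terms scaled by per-user weights, and that pages are grouped with plain k-means under cosine similarity. The names `vectorize`, `context_terms`, `profile_weights`, and `kmeans` are hypothetical, the sample pages are invented for illustration, and the genetic-algorithm refinement is omitted.

```python
from collections import Counter
import math

def vectorize(text, context_terms, profile_weights):
    """Count each context term in the page and scale it by the user's weight.
    (Assumed representation; terms without a profile weight default to 1.0.)"""
    counts = Counter(text.lower().split())
    return [counts[t] * profile_weights.get(t, 1.0) for t in context_terms]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def kmeans(vectors, k, iterations=10):
    """Plain k-means using cosine similarity.
    Simplification: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every page to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented sample data: two pages on clustering, two on football.
pages = [
    "genetic algorithm for clustering web pages",
    "football results and football league tables",
    "clustering of web pages with a genetic algorithm",
    "league tables and football fixtures",
]
context_terms = ["genetic", "clustering", "football", "league"]
profile_weights = {"genetic": 2.0, "football": 0.5}  # hypothetical user profile

vectors = [vectorize(p, context_terms, profile_weights) for p in pages]
labels = kmeans(vectors, k=2)
print(labels)
```

In this sketch the user profile only rescales term counts, so a user who cares more about a term pulls pages containing it closer together. A refinement stage such as the genetic algorithm mentioned in the abstract would take `labels` as its starting point.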