Article ID: | iaor2008838 |
Country: | Netherlands |
Volume: | 42 |
Issue: | 3 |
Start Page Number: | 1338 |
End Page Number: | 1349 |
Publication Date: | Dec 2006 |
Journal: | Decision Support Systems |
Authors: | Fan Weiguo, Wallace Linda, Pathak Praveen |
Keywords: | artificial intelligence: decision support, heuristics: genetic algorithms |
Ranking function is instrumental in affecting the performance of a search engine. Designing and optimizing a search engine's ranking function remains a daunting task for computer and information scientists. Recently, genetic programming (GP), a machine learning technique based on evolutionary theory, has shown promise in tackling this very difficult problem. Ranking functions discovered by GP have been found to be significantly better than many of the other existing ranking functions. However, current GP implementations for ranking function discovery are all designed utilizing the Vector Space model in which the same term weighting strategy is applied to all terms in a document. This may not be an ideal representation scheme at the individual query level considering the fact that many query terms should play different roles in the final ranking. In this paper, we propose a novel nonlinear ranking function representation scheme and compare this new design to the well-known Vector Space model. We theoretically show that the new representation scheme subsumes the traditional Vector Space model representation scheme as a special case and hence allows for additional flexibility in term weighting. We test the new representation scheme with the GP-based discovery framework in a personalized search (information routing) context using a TREC web corpus. The experimental results show that the new ranking function representation design outperforms the traditional Vector Space model for GP-based ranking function discovery.