Article ID: | iaor20127360 |
Volume: | 54 |
Issue: | 1 |
Start Page Number: | 390 |
End Page Number: | 401 |
Publication Date: | Dec 2012 |
Journal: | Decision Support Systems |
Authors: | Zhang Min, Liu Yiqun, Xue Yufei, Xu Danqing, Cen Rongwei, Ma Shaoping, Ru Liyun |
Keywords: | graphs |
Page quality estimation is one of the greatest challenges for Web search engines. Hyperlink analysis algorithms such as PageRank and TrustRank are usually adopted for this task. However, low-quality, unreliable, and even spam data in the Web hyperlink graph make it increasingly difficult to estimate page quality effectively. By analyzing large‐scale user browsing behavior logs, we found that a more reliable Web graph can be constructed by incorporating browsing behavior information. The experimental results show that hyperlink graphs constructed with the proposed methods are much smaller than the original graph. In addition, algorithms based on the proposed ‘surfing with prior knowledge’ model obtain better estimation results with these graphs for both high-quality page and spam page identification tasks. Hyperlink graphs constructed with the proposed methods allow Web page quality to be evaluated more precisely and with less computational effort.
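The abstract does not describe the construction procedure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a hyperlink is kept only if browsing logs show users actually followed it, and that "prior knowledge" can be modeled as a non-uniform teleport vector, as in the seed-biased variants of PageRank that TrustRank builds on. The function names, the `min_clicks` threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict


def filter_graph_by_clicks(hyperlinks, click_counts, min_clicks=1):
    """Keep only hyperlinks that users followed at least `min_clicks` times.

    Assumption for illustration: `click_counts` maps (source, target) pairs
    to the number of observed transitions in a browsing log.
    """
    return [
        edge for edge in hyperlinks
        if click_counts.get(edge, 0) >= min_clicks
    ]


def pagerank(edges, damping=0.85, iterations=50, prior=None):
    """Power-iteration PageRank with an optional non-uniform teleport prior."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}

    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    if prior is None:
        # Uniform teleport vector: plain PageRank.
        prior = {n: 1.0 / len(nodes) for n in nodes}
    else:
        # Normalize the supplied prior over the nodes in this graph.
        total = sum(prior.get(n, 0.0) for n in nodes)
        if total <= 0:
            raise ValueError("prior must give positive weight to some node")
        prior = {n: prior.get(n, 0.0) / total for n in nodes}

    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: redistribute its rank according to the prior.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * prior[m]
        rank = new_rank
    return rank


# Toy data (hypothetical): a small site plus a link farm nobody clicks through.
hyperlinks = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("spam", "a"), ("a", "spam"),
    ("spam", "spam2"), ("spam2", "spam"),
]
click_counts = {("a", "b"): 10, ("b", "c"): 7, ("c", "a"): 5}

filtered = filter_graph_by_clicks(hyperlinks, click_counts)
ranks = pagerank(filtered)
```

In this toy example the filtered graph has fewer edges than the original and no longer contains the unclicked link-farm pages, which mirrors the abstract's claim of a smaller graph in spirit only. Passing a `prior` dictionary that favors known trusted pages biases the teleport step toward them.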