Article ID: | iaor20072631 |
Country: | Netherlands |
Volume: | 42 |
Issue: | 1 |
Start Page Number: | 144 |
End Page Number: | 161 |
Publication Date: | Oct 2006 |
Journal: | Decision Support Systems |
Authors: | Krishnan Ramayya, Caulkins Jonathan P., Duncan George, Ding Wenxuan, Nyberg Eric |
Keywords: | internet |
Various entities (e.g., parents, employers) that provide users (e.g., children, employees) with access to web content wish to limit the content those users can access. Available filtering methods are crude in that they too often block ‘acceptable’ content while failing to block ‘unacceptable’ content. This paper presents a general and flexible classification method, based on statistical techniques applied to text, which we call Filtering by Statistical Classification (FSC). Using each entity's expressed opinions about which content in a training data set is or is not acceptable, FSC constructs a customized model of that entity's preferences. FSC then uses this customized model to examine new web content and to block unwanted content. The empirical results suggest that our method has greater predictive power than a variety of existing approaches.
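The train-then-filter workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical technique FSC uses, so the multinomial naive Bayes classifier below is a stand-in chosen for illustration, not the authors' method. The class name `PreferenceFilter`, its methods, the label strings, and the tiny training set are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class PreferenceFilter:
    """Hypothetical per-entity filter: naive Bayes with add-one smoothing.

    This is an illustrative stand-in, not the FSC model from the paper.
    """

    LABELS = ("acceptable", "unacceptable")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, labeled_docs):
        """labeled_docs: iterable of (text, label) pairs judged by one entity."""
        for text, label in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def _log_score(self, tokens, label):
        # Log prior; assumes at least one training document per label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for tok in tokens:
            if tok not in self.vocab:
                continue  # ignore words never seen in training
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def should_block(self, text):
        """Return True when the text scores higher as 'unacceptable'."""
        tokens = tokenize(text)
        return (self._log_score(tokens, "unacceptable")
                > self._log_score(tokens, "acceptable"))


# Invented judgments from one hypothetical entity (e.g., a parent).
parent_filter = PreferenceFilter()
parent_filter.train([
    ("homework help with fractions and algebra", "acceptable"),
    ("science fair project ideas for kids", "acceptable"),
    ("online casino poker betting bonus", "unacceptable"),
    ("sports betting odds and casino jackpots", "unacceptable"),
])

print(parent_filter.should_block("poker betting bonus odds"))
print(parent_filter.should_block("algebra homework help"))
```

Because each entity trains its own `PreferenceFilter` on its own judgments, two entities can label the same page differently and obtain different blocking behaviour, which is the customization the abstract emphasizes.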