Article ID: iaor200952617
Country: United States
Volume: 20
Issue: 2
Start Page Number: 317
End Page Number: 331
Publication Date: Mar 2008
Journal: INFORMS Journal on Computing
Authors: Aytug Haldun, He Ling, Koehler Gary J
Keywords: statistics: multivariate
Statistical learning theory provides a formal criterion for learning a concept from examples. The theory directly addresses the trade-off between empirical fit and generalization. In practice, this leads to the structural risk-minimization principle, where one minimizes a bound on the overall risk functional. For learning linear discriminant functions, this bound depends on the minimum of two terms: the dimension and the inverse of the margin. A popular and powerful learning mechanism, the support vector machine, focuses on maximizing the margin. We compare this to methods that focus on minimizing the dimensionality, which, coincidentally, fulfills another useful criterion: the minimum description length principle.
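For context, the bound alluded to in the abstract has a standard form in the statistical learning literature (following Vapnik's result for margin hyperplanes; the expression below is illustrative, not quoted from this paper, and the notation h, R, \rho, n is introduced here): for linear discriminants with margin \rho on data contained in a ball of radius R in n dimensions, the VC dimension h satisfies

    h \le \min\left( \left\lceil R^2 / \rho^2 \right\rceil,\; n \right) + 1.

Maximizing the margin \rho drives down the first term, which is the support-vector-machine route; reducing the dimension n drives down the second, which is the route compared here and the one that also aligns with the minimum description length principle.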