Article ID: | iaor20119609 |
Volume: | 62 |
Issue: | 7 |
Start Page Number: | 2801 |
End Page Number: | 2811 |
Publication Date: | Oct 2011 |
Journal: | Computers and Mathematics with Applications |
Authors: | Liu Yang, Alham Nasullah Khalid, Li Maozhen, Hammoud Suhel |
Keywords: | statistics: regression, learning |
Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) have been used extensively because of their generalization properties. However, SVM training is computationally intensive, especially when the training dataset is large. This paper presents MRSMO, a MapReduce-based distributed SVM algorithm for automatic image annotation. The performance of the MRSMO algorithm is evaluated in an experimental environment. By partitioning the training dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computers, the MRSMO algorithm reduces the training time significantly while maintaining a high level of accuracy in both binary and multiclass classifications.
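The partition-then-optimize idea in the abstract can be illustrated with a minimal Python sketch. It is not the MRSMO algorithm itself, and it rests on several assumptions: the per-subset solver is plain subgradient descent on the regularized hinge loss (a linear SVM) standing in for SMO; the reduce step averages the partial models weighted by subset size, since the abstract does not say how partial results are combined; and a sequential `map` stands in for a cluster of machines. All names (`make_data`, `partition`, `train_linear_svm`, `combine`, `predict`) and the toy data are hypothetical.

```python
import random
from functools import reduce

def make_data(n, seed=42):
    """Toy two-class data: label +1 near (2, 2), label -1 near (-2, -2)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([1, -1])
        x = [2 * y + rng.gauss(0, 0.5), 2 * y + rng.gauss(0, 0.5)]
        data.append((x, y))
    return data

def partition(data, n_parts):
    """Split the training set into n_parts subsets, round-robin."""
    return [data[i::n_parts] for i in range(n_parts)]

def train_linear_svm(subset, epochs=20, eta=0.01, lam=0.01, seed=0):
    """Map step (assumed solver, not SMO): subgradient descent on the
    L2-regularized hinge loss over one subset."""
    rng = random.Random(seed)
    w = [0.0] * len(subset[0][0])
    b = 0.0
    for _ in range(epochs):
        rows = subset[:]
        rng.shuffle(rows)
        for x, y in rows:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink w (regularization), then correct on margin violations.
            w = [(1 - eta * lam) * wi for wi in w]
            if margin < 1:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b, len(subset)

def combine(m1, m2):
    """Reduce step (assumed rule): average two partial models,
    weighted by the number of examples each was trained on."""
    w1, b1, n1 = m1
    w2, b2, n2 = m2
    n = n1 + n2
    w = [(n1 * u + n2 * v) / n for u, v in zip(w1, w2)]
    return w, (n1 * b1 + n2 * b2) / n, n

def predict(model, x):
    """Classify x with the sign of the linear decision function."""
    w, b, _ = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

data = make_data(200)
parts = partition(data, 4)                         # four simulated workers
partial_models = list(map(train_linear_svm, parts))  # map: train per subset
model = reduce(combine, partial_models)            # reduce: merge the models
accuracy = sum(predict(model, x) == y for x, y in data) / len(data)
print(f"training accuracy of combined model: {accuracy:.2f}")
```

In this sketch each subset is optimized independently, so the map calls could run on separate machines; only the small `(w, b, n)` tuples need to be sent to the reducer. Whether parameter averaging preserves accuracy on real image-annotation data is not something this toy example establishes.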