Article ID: | iaor20172478 |
Volume: | 29 |
Issue: | 3 |
Start Page Number: | 503 |
End Page Number: | 522 |
Publication Date: | Aug 2017 |
Journal: | INFORMS Journal on Computing |
Authors: | Ghosh Joydeep, Saar-Tsechansky Maytal, Deodhar Meghana, Keshari Vineet |
Keywords: | decision, learning, statistics: regression, information, behaviour, marketing, artificial intelligence, simulation |
Businesses often face the challenge of requiring costly information to improve the accuracy of prediction tasks. One notable example is obtaining informative customer feedback (e.g., customer‐product ratings via costly incentives) to improve the effectiveness of recommender systems. In this paper, we develop a novel active learning approach, which aims to intelligently select informative training instances to be labeled so as to maximally improve the prediction accuracy of a real‐valued prediction model. We focus on large, heterogeneous, and dyadic data, and on localized modeling techniques, which have been shown to model such data particularly well compared with a single, ‘global’ model. Importantly, dyadic data with covariates is pervasive in contemporary big data applications such as large‐scale recommender systems and search advertising. A key benefit of incorporating dyadic information is its simple, meaningful representation of heterogeneous data, in contrast to alternative local modeling techniques that typically produce complex and incomprehensible predictive patterns. We develop a computationally efficient active learning policy specifically tailored to exploit multiple local prediction models to identify informative acquisitions. Existing active learning policies are often computationally prohibitive for the setting we explore, and our policy makes the application of active learning computationally feasible for this setting. We present comprehensive empirical evaluations that demonstrate the benefits of our approach and explore its performance in challenging real‐world domains.