Article ID: | iaor20113886 |
Volume: | 60 |
Issue: | 4 |
Start Page Number: | 719 |
End Page Number: | 743 |
Publication Date: | May 2011 |
Journal: | Computers & Industrial Engineering |
Authors: | Chong Edwin K P, Katanyukul Tatpong, Duff William S |
Keywords: | programming: dynamic, simulation: applications, learning |
This study investigates the application of learning‐based and simulation‐based Approximate Dynamic Programming (ADP) approaches to an inventory problem under the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model. Specifically, we explore the robustness of a learning‐based ADP method, Sarsa, with a GARCH(1,1) demand model, and provide an empirical comparison between Sarsa and two simulation‐based ADP methods: Rollout and Hindsight Optimization (HO). Our findings assuage a concern regarding the effect of GARCH(1,1) latent state variables on learning‐based ADP and provide practical strategies for designing an appropriate ADP method for inventory problems. In addition, we expose a relationship between ADP parameters and conservative behavior. Our empirical results are based on a variety of problem settings, including demand correlations, demand variances, and cost structures.
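To make the "latent state variables" of a GARCH(1,1) demand model concrete, the following is a minimal sketch of the standard GARCH(1,1) recursion applied to demand shocks. It is not the authors' model or code: the function name `simulate_garch_demand`, the constant mean, the parameter values, and the truncation of demand at zero are all illustrative assumptions. The conditional variance `var` and the shock `eps` are the quantities an inventory controller would not observe directly.

```python
import random


def simulate_garch_demand(n, mean=100.0, omega=1.0, alpha=0.1, beta=0.8, seed=0):
    """Simulate n periods of demand with GARCH(1,1) shocks around a constant mean.

    All parameter values are hypothetical; alpha + beta < 1 keeps the
    unconditional variance finite.
    """
    rng = random.Random(seed)
    # Start the latent conditional variance at its unconditional level.
    var = omega / (1.0 - alpha - beta)
    demands = []
    for _ in range(n):
        # Draw this period's shock using the current conditional variance.
        eps = rng.gauss(0.0, 1.0) * var ** 0.5
        # Truncate at zero so demand is never negative (an illustrative choice).
        demands.append(max(0.0, mean + eps))
        # GARCH(1,1) update: next variance depends on this shock and this variance.
        var = omega + alpha * eps ** 2 + beta * var
    return demands


if __name__ == "__main__":
    print(simulate_garch_demand(5))
```

A learning-based method such as Sarsa would see only the realized demands, not `var`, which is the concern about latent state that the abstract refers to.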