Robust Modified Policy Iteration

Article ID: iaor20134940
Volume: 25
Issue: 3
Start Page Number: 396
End Page Number: 410
Publication Date: Jun 2013
Journal: INFORMS Journal on Computing
Authors: ,
Keywords: programming: dynamic
Abstract:

Robust dynamic programming (robust DP) mitigates the effects of ambiguity in transition probabilities on the solutions of Markov decision problems. We consider the computation of robust DP solutions for discrete‐stage, infinite‐horizon, discounted problems with finite state and action spaces. We present robust modified policy iteration (RMPI) and demonstrate its convergence. RMPI encompasses both of the previously known algorithms, robust value iteration and robust policy iteration. In addition to proposing exact RMPI, in which the ‘inner problem’ is solved precisely, we propose inexact RMPI, in which the inner problem is solved to within a specified tolerance. We also introduce new stopping criteria based on the span seminorm. Finally, we demonstrate through some numerical studies that RMPI can significantly reduce computation time.
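
Illustrative sketch (not taken from the article): the Python below shows one way the modified policy iteration scheme described in the abstract might look for a reward-maximizing problem. Several pieces are assumptions made for illustration only. The ambiguity set for each state-action pair is a finite list of candidate transition distributions, so the inner problem reduces to a minimum over that list. The starting value, the span-based stopping threshold, and the midpoint correction follow the standard non-robust modified policy iteration and may differ from the criteria the authors propose. The names `robust_mpi`, `worst_case_expectation`, `rewards`, `models`, and `m` are hypothetical.

```python
def span(x):
    """Span seminorm of a vector: max(x) - min(x)."""
    return max(x) - min(x)


def worst_case_expectation(candidates, v):
    """Inner problem, solved exactly: the smallest expected value of v over
    a finite list of candidate distributions (an assumed ambiguity set)."""
    return min(sum(p * x for p, x in zip(dist, v)) for dist in candidates)


def robust_mpi(rewards, models, gamma, m, eps=1e-8, max_iter=10_000):
    """Sketch of robust modified policy iteration.

    rewards[s][a] : immediate reward for action a in state s
    models[s][a]  : list of candidate next-state distributions for (s, a)
    m             : number of partial evaluation sweeps per iteration
                    (m = 0 reduces to robust value iteration)
    """
    n = len(rewards)
    # Assumed initialization: a constant lower bound on any policy's value,
    # so that the first improvement step cannot decrease the value estimate.
    r_min = min(min(row) for row in rewards)
    v = [r_min / (1.0 - gamma)] * n

    for _ in range(max_iter):
        # Policy improvement against the worst-case transition model.
        policy, u = [], []
        for s in range(n):
            q = [rewards[s][a] + gamma * worst_case_expectation(models[s][a], v)
                 for a in range(len(rewards[s]))]
            best = max(range(len(q)), key=q.__getitem__)
            policy.append(best)
            u.append(q[best])

        # Span-seminorm stopping test (threshold borrowed from non-robust MPI).
        diff = [ui - vi for ui, vi in zip(u, v)]
        if span(diff) < eps * (1.0 - gamma) / gamma:
            # Midpoint of the lower and upper bounds on the robust value.
            shift = gamma / (1.0 - gamma) * (max(diff) + min(diff)) / 2.0
            return policy, [ui + shift for ui in u]

        # Partial policy evaluation: m sweeps of the fixed-policy robust operator.
        for _ in range(m):
            u = [rewards[s][policy[s]]
                 + gamma * worst_case_expectation(models[s][policy[s]], u)
                 for s in range(n)]
        v = u

    raise RuntimeError("robust_mpi did not converge within max_iter iterations")


# Toy problem: state 1 is absorbing with zero reward. In state 0, action 0
# is safe (reward 1, stays put); action 1 pays 1.5 but has a candidate
# model that moves to state 1 with probability 0.5.
rewards = [[1.0, 1.5], [0.0]]
models = [
    [[[1.0, 0.0]], [[1.0, 0.0], [0.5, 0.5]]],
    [[[0.0, 1.0]]],
]
policy, values = robust_mpi(rewards, models, gamma=0.9, m=5)
```

In this toy problem the nominal model alone would favor the higher-paying action in state 0, while the worst-case model makes the safe action preferable, which is the kind of ambiguity the robust formulation is meant to guard against. An inexact variant would replace `worst_case_expectation` with a routine that returns the inner minimum only to within a tolerance; that variant is not sketched here.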
