The optimal reward operator in negative dynamic programming


Article ID:	iaor19931525
Country:	United States
Volume:	17
Issue:	4
Start Page Number:	921
End Page Number:	931
Publication Date:	Nov 1992
Journal:	Mathematics of Operations Research
Authors:	Maitra A., Sudderth W.
Keywords:	programming: Markov decision

Abstract:

The authors consider the negative dynamic programming model of Strauch and prove that the optimal reward function can be obtained by a transfinite iteration of the optimal reward operator. They show that a player loses nothing by being restricted to measurable policies, if the returns from nonmeasurable policies are evaluated by lower integrals.
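
As a rough illustration of what transfinite iteration means here, the sketch below uses notation that is assumed for this note and is not taken from the paper: a state space X, action sets A(x), a nonpositive one-step reward r, a transition law q, and T for the optimal reward operator. Measurability questions, which the paper handles through lower integrals, are ignored.

```latex
% Assumed notation (not from the paper): r \le 0 is the one-step reward,
% q(\cdot \mid x, a) the transition law, T the optimal reward operator.
\[
  (Tu)(x) = \sup_{a \in A(x)} \Bigl[ r(x,a) + \int u(y)\, q(dy \mid x,a) \Bigr]
\]
% Transfinite iteration, indexed by ordinals \alpha:
\[
  u_0 = 0, \qquad
  u_{\alpha+1} = T u_\alpha, \qquad
  u_\lambda = \inf_{\alpha < \lambda} u_\alpha
  \quad (\lambda \text{ a limit ordinal}).
\]
```

Under these assumptions, r ≤ 0 and the monotonicity of T make the iterates nonincreasing in the ordinal index. The result stated in the abstract is that the optimal reward function is reached along such an iteration, presumably because the finite iterates alone need not reach it.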
