The optimal reward operator in negative dynamic programming

Article ID: iaor19931525
Country: United States
Volume: 17
Issue: 4
Start Page Number: 921
End Page Number: 931
Publication Date: Nov 1992
Journal: Mathematics of Operations Research
Authors: ,
Keywords: programming: Markov decision
Abstract:

The authors consider the negative dynamic programming model of Strauch and prove that the optimal reward function can be obtained by a transfinite iteration of the optimal reward operator. They show that a player loses nothing by being restricted to measurable policies, if the returns from nonmeasurable policies are evaluated by lower integrals.
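As a rough illustration of what "transfinite iteration of the optimal reward operator" can mean, the sketch below uses notation that is assumed here and not taken from the article: state $x$, action $a$, daily reward $r \le 0$, transition law $q$, operator $T$, iterates $U_\xi$ indexed by ordinals $\xi$, and optimal reward function $V^*$. Measurability questions, which the article handles with lower integrals, are ignored in this sketch.

```latex
% Assumed notation (not from the source record):
%   r(x,a) <= 0   daily reward in the negative model
%   q(dy | x,a)   transition law
%   T             optimal reward operator
\[
  (Tu)(x) \;=\; \sup_{a}\Bigl[\, r(x,a) + \int u(y)\, q(dy \mid x,a) \Bigr]
\]
% Transfinite iteration, started from the zero function:
\[
  U_0 = 0, \qquad
  U_{\xi+1} = T\,U_\xi, \qquad
  U_\lambda = \inf_{\xi<\lambda} U_\xi \quad (\lambda \text{ a limit ordinal}).
\]
% Since r <= 0 and T is monotone, the iterates are nonincreasing in \xi.
% The result described in the abstract says that the iteration reaches
% the optimal reward function at some ordinal:
\[
  V^* = U_{\xi^*} \quad \text{for some ordinal } \xi^* .
\]
```

In this setting the ordinary iterates $T^n 0$ decrease to a limit that can lie strictly above $V^*$, which is why the iteration may need to continue past the first infinite ordinal.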