A note on maximal mean/standard deviation ratio in an undiscounted MDP

Article ID:	iaor1989725
Country:	Netherlands
Volume:	8
Start Page Number:	201
End Page Number:	203
Publication Date:	Apr 1989
Journal:	Operations Research Letters
Authors:	Chung Kun-Jen

Abstract:

A stationary policy in a Markov decision process (MDP) induces, for each initial state, a stationary probability distribution of the reward. This note concerns the problem of maximizing the ratio of the mean to the standard deviation of that stationary distribution. It concludes that an optimal pure policy exists.
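
As a rough illustration of the quantity discussed in the abstract, the following minimal sketch computes the mean/standard deviation ratio of the stationary reward distribution for each pure policy of a small MDP and picks the largest. Everything in it is an assumption made for illustration and is not taken from the note: the two-state, two-action MDP, the transition table `P`, the reward table `R`, the helper names, and the simplification that every pure policy induces a single recurrent chain, so the stationary distribution does not depend on the initial state.

```python
import itertools
import math

# Hypothetical 2-state, 2-action MDP (illustrative numbers, not from the note).
# P[s][a] lists the transition probabilities to states 0 and 1.
# R[s][a] is the deterministic reward for taking action a in state s.
P = {
    0: {0: [0.5, 0.5], 1: [0.9, 0.1]},
    1: {0: [0.5, 0.5], 1: [0.2, 0.8]},
}
R = {
    0: {0: 1.0, 1: 2.0},
    1: {0: 3.0, 1: 2.5},
}


def stationary_two_state(p01, p10):
    """Stationary distribution of a 2-state chain.

    p01 = P(0 -> 1), p10 = P(1 -> 0); assumes p01 + p10 > 0.
    """
    total = p01 + p10
    return [p10 / total, p01 / total]


def mean_std_ratio(policy):
    """Mean/standard deviation ratio of the stationary reward distribution.

    `policy` is a pure policy given as a tuple (action in state 0, action in state 1).
    """
    p01 = P[0][policy[0]][1]
    p10 = P[1][policy[1]][0]
    pi = stationary_two_state(p01, p10)
    rewards = [R[s][policy[s]] for s in (0, 1)]
    mean = sum(pi[s] * rewards[s] for s in (0, 1))
    variance = sum(pi[s] * (rewards[s] - mean) ** 2 for s in (0, 1))
    std = math.sqrt(variance)
    # A zero standard deviation makes the ratio undefined; treated here as infinite.
    return mean / std if std > 0 else math.inf


# Enumerate all pure policies and keep the one with the largest ratio.
pure_policies = list(itertools.product((0, 1), repeat=2))
best_policy = max(pure_policies, key=mean_std_ratio)
```

The sketch searches only over pure policies, which is enough if a pure optimum exists as the abstract states; it does not itself verify that claim against randomized policies.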
