A policy improvement method for constrained average Markov decision processes

Article ID: iaor20083421
Country: Netherlands
Volume: 35
Issue: 4
Start Page Number: 434
End Page Number: 438
Publication Date: Jul 2007
Journal: Operations Research Letters
Authors:
Abstract:

This brief paper presents a policy improvement method for constrained Markov decision processes (MDPs) with the average cost criterion under an ergodicity assumption, extending Howard's policy improvement method for unconstrained MDPs. The improvement step induces a policy-iteration-type algorithm that converges to a locally optimal policy.
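For context, the classical (unconstrained) Howard policy iteration that the paper extends alternates between evaluating the average cost (gain) and bias of the current policy and improving it greedily. The sketch below illustrates that baseline on a hypothetical two-state, two-action MDP with made-up transition probabilities and costs; it is not the paper's constrained method, only the standard algorithm it builds on, under the same kind of ergodicity assumption.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP for illustration only.
# P[a][s][s'] = transition probability, c[a][s] = one-step cost.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.6, 0.4]],   # action 1
])
c = np.array([
    [2.0, 0.5],   # action 0
    [1.0, 3.0],   # action 1
])
n_states = 2

def evaluate(policy):
    """Policy evaluation for the average-cost criterion:
    solve g + h(s) = c(s, pi(s)) + sum_s' P(s'|s, pi(s)) h(s'),
    pinning h(0) = 0 (valid under the ergodicity assumption)."""
    # Unknown vector: [g, h(1)]  (h(0) is normalized to 0).
    A = np.zeros((n_states, n_states))
    b = np.zeros(n_states)
    for s in range(n_states):
        a = policy[s]
        A[s, 0] = 1.0                       # coefficient of g
        h_coef = np.zeros(n_states)
        h_coef[s] += 1.0                    # + h(s)
        h_coef -= P[a, s]                   # - sum_s' P(s'|s,a) h(s')
        A[s, 1] = h_coef[1]                 # h(0) column dropped
        b[s] = c[a, s]
    g, h1 = np.linalg.solve(A, b)
    return g, np.array([0.0, h1])

def improve(h):
    """Howard improvement step: argmin_a c(s,a) + sum_s' P(s'|s,a) h(s')."""
    q = c + P @ h                           # q[a, s]
    return np.argmin(q, axis=0)

policy = np.zeros(n_states, dtype=int)      # start with action 0 everywhere
for _ in range(100):                        # safety cap; converges in few steps
    g, h = evaluate(policy)
    new_policy = improve(h)
    if np.array_equal(new_policy, policy):
        break                               # greedy policy is stable: optimal
    policy = new_policy

print(policy, g)                            # stable policy and its average cost
```

On these toy numbers the iteration starts at gain 1.5 and stabilizes after one improvement step at the policy taking action 1 in state 0 and action 0 in state 1, with average cost 9/14. The constrained version in the paper modifies the improvement step so that cost constraints are respected, which is why it guarantees only a locally optimal policy.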
