Denumerable controlled Markov chains with average reward criterion: Sample path optimality

Article ID: iaor19952258
Country: Germany
Volume: 41
Start Page Number: 89
End Page Number: 108
Publication Date: Mar 1995
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors: ,
Abstract:

The authors consider discrete-time nonlinear controlled stochastic systems, modeled by controlled Markov chains with denumerable state space and compact action space. The corresponding stochastic control problem of maximizing the long-run average reward is studied. Departing from the most common approach, which uses expected values of rewards, the authors focus on a sample path analysis of the stream of states and rewards. Under a Lyapunov function condition, they show that stationary policies obtained from the average reward optimality equation are not only average reward optimal in the expected-value sense, but also sample path average reward optimal, for almost all sample paths.
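
The distinction drawn in the abstract can be written out as follows. This is a sketch in commonly used notation, not necessarily the notation of the paper: the state space S, action sets A(x), reward r, transition law p, the constant g and the function h are all assumed symbols introduced here for illustration.

```latex
% Assumed notation: X_t = state, A_t = action, r = one-step reward,
% p(y | x, a) = transition law, \pi = policy, f^* = stationary policy.

% Expected average reward of policy \pi from initial state x:
J(x,\pi) = \liminf_{n\to\infty} \frac{1}{n}\,
  E_x^{\pi}\!\left[\sum_{t=0}^{n-1} r(X_t, A_t)\right]

% Average reward optimality equation, with constant g and function h on S:
g + h(x) = \sup_{a \in A(x)} \left[ r(x,a) + \sum_{y \in S} p(y \mid x,a)\, h(y) \right],
  \qquad x \in S

% Sample path average optimality of f^* (one common formalization):
\lim_{n\to\infty} \frac{1}{n} \sum_{t=0}^{n-1} r(X_t, A_t) = g
  \quad P_x^{f^*}\text{-almost surely},
\qquad
\limsup_{n\to\infty} \frac{1}{n} \sum_{t=0}^{n-1} r(X_t, A_t) \le g
  \quad P_x^{\pi}\text{-almost surely, for every policy } \pi
```

In the first criterion the expectation is taken before the limit. In the last, the time average is compared along individual trajectories, which is the sense of "for almost all sample paths" in the abstract.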
