| Article ID: | iaor2004689 |
| Country: | United States |
| Volume: | 22 |
| Issue: | 1 |
| Start Page Number: | 222 |
| End Page Number: | 255 |
| Publication Date: | Feb 1997 |
| Journal: | Mathematics of Operations Research |
| Authors: | Burnetas A.N., Katehakis M.N. |
In this paper we consider the problem of adaptive control for Markov Decision Processes. We give the explicit form for a class of adaptive policies that possess optimal increase rate properties for the total expected finite horizon reward, under sufficient assumptions of finite state-action spaces and irreducibility of the transition law. A main feature of the proposed policies is that the choice of actions, at each state and time period, is based on indices that are inflations of the right-hand side of the estimated average reward optimality equations.
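
The index idea in the abstract can be illustrated with a minimal sketch. For each action at the current state, it evaluates the right-hand side of the average reward optimality equation under the estimated model (estimated reward plus expected estimated bias of the next state) and then inflates it. The abstract does not give the form of the inflation, so the square-root bonus below is an assumed UCB-style stand-in, not the paper's index; the names `inflated_indices`, `choose_action`, `counts`, `r_hat`, `p_hat`, and `h_hat` are all hypothetical.

```python
import math


def inflated_indices(state, t, counts, r_hat, p_hat, h_hat):
    """Return {action: index} for `state` at time period `t`.

    counts[state][a]   : number of times action a was taken in state
    r_hat[state][a]    : estimated one-step reward
    p_hat[state][a][y] : estimated probability of moving to state y
    h_hat[y]           : estimated bias (relative value) of state y
    """
    indices = {}
    for action, n in counts[state].items():
        if n == 0:
            # Untried actions get an infinite index so they are sampled first.
            indices[action] = math.inf
            continue
        # Right-hand side of the estimated average reward optimality equation.
        rhs = r_hat[state][action] + sum(
            prob * h_hat[nxt] for nxt, prob in p_hat[state][action].items()
        )
        # ASSUMPTION: a UCB-style bonus stands in for the paper's inflation.
        bonus = math.sqrt(2.0 * math.log(t) / n)
        indices[action] = rhs + bonus
    return indices


def choose_action(state, t, counts, r_hat, p_hat, h_hat):
    """Pick the action with the largest inflated index at `state`."""
    indices = inflated_indices(state, t, counts, r_hat, p_hat, h_hat)
    return max(indices, key=indices.get)


# Toy two-state example with made-up estimates (illustrative data only).
counts = {0: {"a": 100, "b": 4}}
r_hat = {0: {"a": 1.0, "b": 0.9}}
p_hat = {0: {"a": {0: 0.5, 1: 0.5}, "b": {0: 0.5, 1: 0.5}}}
h_hat = {0: 0.0, 1: 0.0}

chosen = choose_action(0, 104, counts, r_hat, p_hat, h_hat)
```

In this toy setup the rarely tried action `"b"` has the lower estimated value but receives the larger inflation, so it is the one selected; as its count grows the bonus shrinks and the choice reverts to the action with the better estimate.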