Constrained Semi-Markov Decision Processes with average rewards

Article ID: iaor19952246
Country: Germany
Volume: 39
Start Page Number: 257
End Page Number: 288
Publication Date: Jun 1994
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors:
Abstract:

This paper deals with constrained average reward Semi-Markov Decision Processes with finite state and action sets. The paper considers two average reward criteria. The first criterion is time-average rewards, which equal the lower limits of the expected average rewards per unit time, as the horizon tends to infinity. The second criterion is ratio-average rewards, which equal the lower limits of the ratios of the expected total rewards during the first n steps to the expected total duration of these n steps as n → ∞. For both criteria, the paper proves the existence of optimal mixed stationary policies for constrained problems when the constraints are of the same nature as the objective functions. For unichain problems, it shows the existence of randomized stationary policies which are optimal for both criteria. However, optimal mixed stationary policies may be different for each of these criteria even for unichain problems. The paper provides linear programming algorithms for the computation of optimal policies.
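
For orientation, the two criteria described in the abstract can be written out as below, followed by a textbook-style linear program for the unichain constrained case. This is a sketch only: the notation is assumed here and is not taken from the paper (policy π, initial state x, reward r_i and sojourn time τ_i of the i-th step, R(t) for the total reward accumulated up to time t, transition probabilities p(j|s,a), expected sojourn times τ(s,a), objective reward r_0, constraint rewards r_k with levels c_k), and the program shown is a standard formulation, not necessarily the algorithm the paper provides.

```latex
% Time-average reward: lower limit of expected reward per unit time.
W_{\mathrm{time}}(x,\pi) = \liminf_{t\to\infty} \frac{1}{t}\,
    \mathbb{E}_x^{\pi}\bigl[R(t)\bigr]

% Ratio-average reward: expected reward over the first n steps divided by
% the expected duration of those n steps.
W_{\mathrm{ratio}}(x,\pi) = \liminf_{n\to\infty}
    \frac{\mathbb{E}_x^{\pi}\bigl[\sum_{i=0}^{n-1} r_i\bigr]}
         {\mathbb{E}_x^{\pi}\bigl[\sum_{i=0}^{n-1} \tau_i\bigr]}

% Assumed standard LP for the unichain case with K constraints.
% The variables z(s,a) are state-action frequencies scaled per unit time.
\begin{aligned}
\max_{z \ge 0}\quad & \sum_{s,a} r_0(s,a)\, z(s,a) \\
\text{s.t.}\quad
  & \sum_{a} z(j,a) - \sum_{s,a} p(j \mid s,a)\, z(s,a) = 0
      \quad \text{for every state } j, \\
  & \sum_{s,a} \tau(s,a)\, z(s,a) = 1, \\
  & \sum_{s,a} r_k(s,a)\, z(s,a) \ge c_k, \quad k = 1,\dots,K .
\end{aligned}
```

Under these assumptions, a randomized stationary policy can be read off a feasible solution by choosing action a in state s with probability z(s,a) / Σ_a' z(s,a') wherever the denominator is positive, and choosing arbitrarily elsewhere.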
