Article ID: iaor19961760
Country: Germany
Volume: 42
Issue: 2
Start Page Number: 169
End Page Number: 188
Publication Date: Sep 1995
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors: Altman E., Spieksma F.
Linear Programming is known to be an important and useful tool for solving Markov Decision Processes (MDPs). Its derivation relies on the Dynamic Programming approach, which also serves to solve MDPs. However, for Markov Decision Processes with several constraints the only available solution methods are based on Linear Programs. The aim of this paper is to investigate some aspects of such Linear Programs, related to multi-chain MDPs. The authors first present a stochastic interpretation of the decision variables that appear in the Linear Programs available in the literature. They then show, for the multi-constrained Markov Decision Process, that the Linear Program can be obtained from an equivalent unconstrained Lagrange formulation of the control problem. This establishes the connection between the Linear Program approach and the Lagrange approach, which was previously used only in the case of a single constraint.
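For orientation, the sketch below gives the standard occupation-measure Linear Program for an average-cost constrained MDP and its Lagrangian relaxation, in common textbook notation (state x, action a, transition law p, immediate cost c, constraint costs d_k with bounds V_k). This is the unichain case only, not the multi-chain formulation studied in the paper, which carries additional variables; the notation is an assumption for illustration and not necessarily that of the authors.

```latex
% Occupation-measure LP for a constrained average-cost MDP
% (illustrative unichain form; notation assumed, not the paper's):
\begin{align*}
\min_{z \ge 0} \quad & \sum_{x,a} z(x,a)\, c(x,a) \\
\text{s.t.} \quad & \sum_{x,a} z(x,a)\,\bigl(\delta_y(x) - p(y \mid x,a)\bigr) = 0
    \quad \forall y, \\
& \sum_{x,a} z(x,a) = 1, \\
& \sum_{x,a} z(x,a)\, d_k(x,a) \le V_k, \quad k = 1,\dots,K .
\end{align*}
% Attaching multipliers \lambda_k \ge 0 to the K cost constraints yields an
% unconstrained MDP with immediate cost c(x,a) + \sum_k \lambda_k d_k(x,a);
% its LP, together with the outer maximization over \lambda, recovers the
% program above. This is the kind of Lagrange-to-LP connection the abstract
% refers to, shown here only in its simplest textbook form.
```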