A stochastic approximation for parameters of the Markov decision processes

Article ID: iaor20051045
Country: China
Volume: 25
Issue: 5
Start Page Number: 377
End Page Number: 380
Publication Date: Sep 2003
Journal: Journal of Yunnan University
Authors:
Abstract:

A stochastic gradient algorithm is proposed for Markov decision processes whose average reward depends on a parameter vector. A new expression for the gradient of the objective function is given, and a stochastic approximation algorithm based on a single sample path is presented. Finally, a proof that the gradient converges with probability 1 is provided.
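
The abstract does not give the gradient formula or the update rule, so the following is only a minimal sketch of the general idea of single-sample-path stochastic approximation for average reward, not the paper's algorithm. It swaps in a generic score-function (likelihood-ratio) estimator with a discounted eligibility trace. The two-state model, the table `P_NEXT_ONE`, the scalar parameter `theta`, the discount `beta`, and the step-size schedule are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical toy model: P(next state = 1 | state, action).
P_NEXT_ONE = {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.2, (1, 1): 0.8}


def average_reward(theta):
    """Exact long-run average reward of the toy chain, with reward r(s) = s."""
    p = sigmoid(theta)  # probability of choosing action 1 in either state
    p01 = (1 - p) * P_NEXT_ONE[(0, 0)] + p * P_NEXT_ONE[(0, 1)]
    p11 = (1 - p) * P_NEXT_ONE[(1, 0)] + p * P_NEXT_ONE[(1, 1)]
    p10 = 1.0 - p11
    return p01 / (p01 + p10)  # stationary probability of state 1


def single_path_ascent(theta=0.0, n_steps=50000, beta=0.9, seed=0):
    """Update theta along one simulated trajectory (assumed scheme)."""
    rng = random.Random(seed)
    state, trace = 0, 0.0
    for k in range(n_steps):
        p = sigmoid(theta)
        action = 1 if rng.random() < p else 0
        score = action - p  # d/dtheta of log pi(action | theta)
        state = 1 if rng.random() < P_NEXT_ONE[(state, action)] else 0
        reward = float(state)
        trace = beta * trace + score  # discounted eligibility trace
        step = 0.02 / (1.0 + k / 5000.0)  # decreasing step size
        theta += step * reward * trace  # stochastic approximation update
    return theta
```

Calling `single_path_ascent()` returns the final parameter, and `average_reward` can be used to compare the objective before and after the run. The discount `beta` below 1 makes the estimate biased but lower in variance, which is a property of this stand-in estimator and says nothing about the gradient proposed in the article.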
