Performance loss bounds for approximate value iteration with state aggregation

Article ID: iaor200934351
Country: United States
Volume: 31
Issue: 2
Start Page Number: 234
End Page Number: 244
Publication Date: May 2006
Journal: Mathematics of Operations Research
Authors:
Abstract:

We consider approximate value iteration with a parameterized approximator in which the state space is partitioned and the optimal cost-to-go function over each partition is approximated by a constant. We establish performance loss bounds for policies derived from approximations associated with fixed points. These bounds identify benefits to using invariant distributions of appropriate policies as projection weights. Such projection weighting relates to what is done by temporal-difference learning. Our analysis also leads to the first performance loss bound for approximate value iteration with an average-cost objective.
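
The scheme the abstract describes can be illustrated with a minimal sketch for the discounted-cost case. It is not taken from the paper: the function names, the toy two-state problem, and the uniform projection weights are all assumptions made for illustration. Each iteration applies a Bellman backup to the piecewise-constant approximation and then replaces the result on each partition cell by its weighted average.

```python
def bellman_backup(J, P, g, alpha):
    """(TJ)(x) = min over actions a of g[x][a] + alpha * sum_y P[a][x][y] * J[y]."""
    n = len(J)
    return [
        min(
            g[x][a] + alpha * sum(P[a][x][y] * J[y] for y in range(n))
            for a in range(len(P))
        )
        for x in range(n)
    ]


def project(J, partition, weights, num_cells):
    """Weighted average of J over each partition cell (cells need positive weight)."""
    totals = [0.0] * num_cells
    mass = [0.0] * num_cells
    for x, cell in enumerate(partition):
        totals[cell] += weights[x] * J[x]
        mass[cell] += weights[x]
    return [totals[c] / mass[c] for c in range(num_cells)]


def aggregated_value_iteration(P, g, alpha, partition, weights, num_cells,
                               tol=1e-10, max_iters=10000):
    """Iterate r <- project(T(lift(r))) until successive iterates differ by < tol."""
    r = [0.0] * num_cells
    for _ in range(max_iters):
        J = [r[cell] for cell in partition]  # lift cell values to a full cost-to-go vector
        r_next = project(bellman_backup(J, P, g, alpha), partition, weights, num_cells)
        if max(abs(a - b) for a, b in zip(r, r_next)) < tol:
            return r_next
        r = r_next
    return r


# Hypothetical toy problem: two absorbing states, one action, per-stage costs 1 and 3.
P = [[[1.0, 0.0], [0.0, 1.0]]]   # P[a][x][y]: transition probabilities
g = [[1.0], [3.0]]               # g[x][a]: per-stage cost
partition = [0, 0]               # both states share a single cell
weights = [0.5, 0.5]             # uniform weights, chosen only as a placeholder

r = aggregated_value_iteration(P, g, 0.5, partition, weights, num_cells=1)
print(r)
```

In this toy problem the fixed point solves r = 2 + 0.5 r, so the single cell value is about 4, while the exact costs-to-go of the two states are 2 and 6. The uniform weights here are a stand-in; the abstract's point is that the choice of projection weights matters, and that invariant distributions of appropriate policies are a beneficial choice.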
