Restless bandits, linear programming relaxations, and a primal–dual index heuristic

Article ID: iaor20012014
Country: United States
Volume: 48
Issue: 1
Start Page Number: 80
End Page Number: 90
Publication Date: Jan 2000
Journal: Operations Research
Authors:
Keywords: programming: probabilistic, Markov processes
Abstract:

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) successively stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential-size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computable. We also propose a priority-index heuristic scheduling policy derived from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we obtain both a policy and a suboptimality guarantee. We report results of computational experiments suggesting that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
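To illustrate the kind of policy the abstract describes, the following is a minimal sketch of a priority-index scheduling rule: each bandit carries a per-state index (in the paper these would come from the optimal dual variables of the first-order LP relaxation; here they are random placeholders), and each period the M bandits with the highest current-state indices are made active. All names, sizes, and transition kernels below are hypothetical, chosen only to show the structure of such a heuristic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N restless bandits, each with S states, M activated
# per period. The indices would be derived from dual variables of the
# first-order relaxation; random values stand in for them here.
N, S, M = 5, 3, 2
indices = rng.normal(size=(N, S))   # indices[n, s]: priority of bandit n in state s

def random_stochastic(shape):
    """Random row-stochastic matrices as placeholder transition kernels."""
    m = rng.random(shape)
    return m / m.sum(axis=-1, keepdims=True)

# P[a][n] is the S x S transition matrix of bandit n under action a
# (0 = passive, 1 = active).
P = {a: random_stochastic((N, S, S)) for a in (0, 1)}

state = rng.integers(0, S, size=N)  # current state of each bandit

def index_policy_step(state):
    """Activate the M bandits whose current-state index is largest,
    then sample each bandit's next state under its chosen action."""
    priorities = indices[np.arange(N), state]
    active = np.argsort(priorities)[-M:]          # top-M bandits by index
    next_state = np.empty_like(state)
    for n in range(N):
        a = 1 if n in active else 0
        next_state[n] = rng.choice(S, p=P[a][n, state[n]])
    return active, next_state

active, state = index_policy_step(state)
print("activated bandits:", sorted(active.tolist()))
```

The key design point is that the per-period decision reduces to a sort over N scalar indices, which is what makes index heuristics attractive for a problem whose exact formulation is exponentially large.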
