Restless bandits, linear programming relaxations, and a primal–dual index heuristic

Article ID: iaor20012014
Country: United States
Volume: 48
Issue: 1
Start Page Number: 80
End Page Number: 90
Publication Date: Jan 2000
Journal: Operations Research
Authors:
Keywords: programming: probabilistic, Markov processes
Abstract:

We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of N (where N is the number of bandits) successively stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential-size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computable. We also propose a priority-index heuristic scheduling policy derived from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we obtain both a policy and a suboptimality guarantee. We report results of computational experiments suggesting that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
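To illustrate the kind of policy the abstract describes, the following is a minimal sketch of a priority-index scheduling rule: each bandit carries a per-state index (in the paper these would come from the optimal dual variables of the first-order LP relaxation; here they are random placeholders), and each period the M bandits with the highest current-state indices are made active. All names, sizes, and transition kernels below are hypothetical, chosen only to show the structure of such a heuristic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N restless bandits, each with S states, M activated
# per period. The indices would be derived from dual variables of the
# first-order relaxation; random values stand in for them here.
N, S, M = 5, 3, 2
indices = rng.normal(size=(N, S))   # indices[n, s]: priority of bandit n in state s

def random_stochastic(shape):
    """Random row-stochastic matrices as placeholder transition kernels."""
    m = rng.random(shape)
    return m / m.sum(axis=-1, keepdims=True)

# P[a][n] is the S x S transition matrix of bandit n under action a
# (0 = passive, 1 = active).
P = {a: random_stochastic((N, S, S)) for a in (0, 1)}

state = rng.integers(0, S, size=N)  # current state of each bandit

def index_policy_step(state):
    """Activate the M bandits whose current-state index is largest,
    then sample each bandit's next state under its chosen action."""
    priorities = indices[np.arange(N), state]
    active = np.argsort(priorities)[-M:]          # top-M bandits by index
    next_state = np.empty_like(state)
    for n in range(N):
        a = 1 if n in active else 0
        next_state[n] = rng.choice(S, p=P[a][n, state[n]])
    return active, next_state

active, state = index_policy_step(state)
print("activated bandits:", sorted(active.tolist()))
```

The key design point is that the per-period decision reduces to a sort over N scalar indices, which is what makes index heuristics attractive for a problem whose exact formulation is exponentially large.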
