Article ID: iaor20115021
Volume: 213
Issue: 1
Start Page Number: 124
End Page Number: 133
Publication Date: Aug 2011
Journal: European Journal of Operational Research
Authors: Ohno Katsuhisa
Keywords: programming: markov decision, programming: dynamic, simulation: applications
In just-in-time (JIT) production systems, each stage holds input stock in the form of parts and output stock in the form of products. These stocks are controlled by production-ordering and withdrawal kanbans. This paper discusses a discrete-time optimal control problem in a multistage JIT-based production and distribution system with stochastic demand and capacity, formulated to minimize the expected total cost per unit of time. The problem can be cast as an undiscounted Markov decision process (UMDP), but the curse of dimensionality makes an exact solution very difficult to obtain. The author proposes a new neuro-dynamic programming (NDP) algorithm, the simulation-based modified policy iteration method (SBMPIM), to solve the optimal control problem. Existing NDP algorithms and the SBMPIM are compared numerically with a traditional UMDP algorithm on a single-stage JIT production system; all NDP algorithms except the SBMPIM fail to converge to an optimal control. In addition, a new algorithm for finding the optimal parameters of pull systems is proposed. Near-optimal controls computed with the SBMPIM are compared numerically with optimized pull systems for three-stage JIT-based production and distribution systems, and UMDPs with 42 million states are solved using the SBMPIM. The pull systems discussed are the kanban, base stock, CONWIP, hybrid, and extended kanban systems.
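To make the undiscounted (average-cost) MDP formulation concrete, the sketch below solves a toy single-stage make-to-stock system by standard relative value iteration. This is an illustration only, not the paper's SBMPIM: the state space, demand distribution, and cost parameters are all made up for the example, and the full multistage problem the paper addresses is far too large for this exact method.

```python
import numpy as np

# Hypothetical toy model (not the paper's): state = on-hand inventory,
# action = units produced this period, demand is i.i.d. with a small pmf.
S_MAX, CAP = 10, 4                         # max inventory, production capacity
HOLD, SHORT = 1.0, 10.0                    # holding / shortage cost per unit
DEMAND = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}  # assumed demand distribution

def step_cost_and_next(s, a):
    """Expected one-period cost and next-state pmf for (state, action)."""
    avail = min(s + a, S_MAX)
    cost, pmf = 0.0, {}
    for d, p in DEMAND.items():
        nxt = max(avail - d, 0)
        cost += p * (HOLD * nxt + SHORT * max(d - avail, 0))
        pmf[nxt] = pmf.get(nxt, 0.0) + p
    return cost, pmf

def relative_value_iteration(tol=1e-9, max_iter=10_000):
    """Average-cost optimal control via relative value iteration."""
    h = np.zeros(S_MAX + 1)
    gain = 0.0
    for _ in range(max_iter):
        h_new = np.empty_like(h)
        for s in range(S_MAX + 1):
            best = float("inf")
            for a in range(CAP + 1):
                cost, pmf = step_cost_and_next(s, a)
                best = min(best, cost + sum(p * h[n] for n, p in pmf.items()))
            h_new[s] = best
        gain = h_new[0]          # estimate of optimal average cost per period
        h_new -= gain            # subtract at reference state to stay bounded
        if np.max(np.abs(h_new - h)) < tol:
            break
        h = h_new
    # Greedy policy with respect to the converged relative values.
    policy = []
    for s in range(S_MAX + 1):
        qs = []
        for a in range(CAP + 1):
            cost, pmf = step_cost_and_next(s, a)
            qs.append(cost + sum(p * h[n] for n, p in pmf.items()))
        policy.append(int(np.argmin(qs)))
    return gain, policy

gain, policy = relative_value_iteration()
print(f"average cost per period ~ {gain:.4f}")
print("policy (units to produce in state s):", policy)
```

The enumeration over states and actions here is exactly what breaks down at 42 million states; NDP methods such as the SBMPIM replace the exact value table with a simulated, approximated one.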