We consider the subclass of linear programs that formulate Markov Decision Processes (mdps). We show that the Simplex algorithm with the Gass‐Saaty shadow‐vertex pivoting rule is strongly polynomial for a subclass of mdps, called controlled random walks (CRWs); the running time is O(|S|3·|U|2), where |S| denotes the number of states and |U| denotes the number of actions per state. This result improves the running time of Zadorojniy et al. (2009) algorithm by a factor of |S|. In particular, the number of iterations needed by the Simplex algorithm for CRWs is linear in the number of states and does not depend on the discount factor.