Motivated by the trade‐off between reliability and utilization in a stochastic service system, we consider a Markovian multi‐server vacation queueing system with c unreliable servers. In such a system, some servers may be unavailable because of either planned stoppages (vacations) or unplanned service interruptions (server failures). The vacations are controlled by a threshold policy: at a service completion instant, if d (⩽ c) servers become idle, they take a vacation together and keep taking vacations until they find at least c − d + 1 customers in the system at a vacation completion instant, at which point they return to serve the queue. In addition, all on‐duty servers are subject to failures and can be repaired within a random period of time. We formulate the system as a quasi‐birth–death (QBD) process, establish the stability condition, and develop a computational algorithm to obtain its stationary performance measures. Numerical examples illustrate the performance evaluation and optimization of such a system. The insights gained from this model can help practitioners make capacity and operating decisions for this type of waiting‐line system.