Consider a flexible flow shop with s stages in series and at each stage a number of identical machines in parallel. There are n jobs to be processed and each job has to go through the stages following the same route. Job j has release date rj, due date dj, weight wj and a processing time pjt at stage l, l = 1, …, s. The objective is to minimize the total weighted tardiness of the n jobs. In this paper we describe and analyse three heuristics. The first one is a decomposition algorithm that solves the problem by decomposing the flexible flow shop problem into a series of single-stage scheduling subproblems. The second one is an algorithm based on local search. The third heuristic is a hybrid algorithm that combines the first two. We conclude with a comparative study of the three heuristics.