Intensity modulated radiation therapy (IMRT) is one of the most widely used modalities for delivering radiation therapy to cancer patients. A patient is typically treated in daily fractions over a period of 5–9 weeks. In this paper, we consider the problem of accounting for changes in patient setup location and internal geometry between treatment fractions, usually referred to as interfraction motion. The conventional method is to add a margin around the clinical target volume (CTV) to obtain a planning target volume (PTV). A fluence map optimization (FMO) model is then solved to determine the optimal intensity profiles to deliver to the patient. However, because organ motion is random, a margin‐based method may not adequately model the resulting changes in the dose distribution. Accounting for interfraction motion in the FMO model essentially transforms the deterministic optimization problem into a stochastic one. We propose a stochastic FMO model that employs convex penalty functions to control treatment plan quality and uses a large number of scenarios to characterize the uncertainty due to interfraction motion. Some effects of radiotherapy depend mainly on the dose distribution in a given treatment fraction, while others tend to manifest themselves over time and depend mostly on the total dose received over the course of treatment. We therefore formulate an optimization model that explicitly incorporates both treatment plan evaluation criteria that apply to the total dose received over all treatment fractions and criteria that apply to the dose per fraction. Particularly when many structures fall into the former category, this distinction can significantly reduce the dimension of the optimization model and therefore the time required to solve it. We test an instance of our model on five clinical prostate cancer cases to demonstrate the efficacy of our approach.
In particular, compared with a traditional margin‐based treatment plan, our plans exhibit significantly improved target dose coverage and clinically equivalent critical structure sparing, with only a modest increase in computational effort.
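
To make the distinction between per-fraction and total-dose criteria concrete, the following is a minimal Python sketch and not the formulation used in this paper. It assumes a two-sided quadratic penalty as the convex penalty function, a finite set of motion scenarios with hypothetical dose deposition matrices and probabilities, and an approximation of the total dose by the number of fractions times the expected per-fraction dose. All function and parameter names are illustrative. Under these assumptions, a per-fraction criterion needs one penalty term per scenario, whereas a total-dose criterion needs only a single averaged matrix, which suggests where a reduction in model dimension could come from.

```python
import numpy as np


def penalty(dose, target, w_under=1.0, w_over=1.0):
    """Convex two-sided quadratic penalty, summed over voxels (assumed form)."""
    under = np.maximum(target - dose, 0.0)  # shortfall below the target dose
    over = np.maximum(dose - target, 0.0)   # excess above the target dose
    return float(np.sum(w_under * under**2 + w_over * over**2))


def per_fraction_term(x, scenario_matrices, probs, target):
    """Expected penalty on the dose in one fraction: one term per scenario."""
    return sum(p * penalty(D @ x, target)
               for D, p in zip(scenario_matrices, probs))


def total_dose_term(x, scenario_matrices, probs, n_fractions, target):
    """Penalty on the approximate total dose, using one averaged matrix.

    Assumption: total dose ~ n_fractions * expected per-fraction dose.
    """
    mean_matrix = sum(p * D for D, p in zip(scenario_matrices, probs))
    return penalty(n_fractions * (mean_matrix @ x), target)


# Toy data: two voxels, two beamlets, two equally likely motion scenarios.
D1 = np.array([[1.0, 0.0], [0.0, 1.0]])  # nominal position
D2 = np.array([[0.0, 1.0], [1.0, 0.0]])  # voxels swap exposure after a shift
probs = [0.5, 0.5]
x = np.array([2.0, 0.0])                 # beamlet intensities (fluence)

per_fraction = per_fraction_term(x, [D1, D2], probs, target=1.0)
total = total_dose_term(x, [D1, D2], probs, n_fractions=2, target=2.0)
print(per_fraction, total)
```

In this toy case every individual fraction misses its per-fraction target, yet the averaged total dose meets its target exactly, which illustrates why the two kinds of criteria can rank the same fluence map differently.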