The process by which individuals learn from feedback when making recurrent choices among ambiguous alternatives is explored. The authors describe an experiment in which subjects solve a variant of the classic armed-bandit problem of dynamic decision theory, set in the context of airline choice. Subjects are asked to make repeated choices between two hypothetical airlines, one with an on-time departure probability that is known a priori and the other with an ambiguous probability whose true value can be discovered only by making sample trips on the airline. Subjects attempt to choose so as to maximize the total number of on-time departures over a fixed planning horizon. The authors examine the extent to which actual choice patterns over time are consistent with those that would be made by a decision maker acting as an optimal Bernoulli sampler. The data offer support for a number of expected, and some unexpected, departures from optimality, including a tendency to underexperiment with promising options and overexperiment with unpromising options, and a tendency to switch between airlines more often as the average base rate of departures decreases. Implications of the work are explored, both for the descriptive validity of normative dynamic decision models and for the generalizability of previous findings about choice under ambiguity to dynamic settings.