Article ID: | iaor19972591 |
Country: | United Kingdom |
Volume: | 28 |
Issue: | 4 |
Start Page Number: | 1051 |
End Page Number: | 1071 |
Publication Date: | Dec 1996 |
Journal: | Advances in Applied Probability |
Authors: | Waterman Michael S., Steel Mike, Goldstein Larry |
Keywords: | probability |
In phylogenetic analysis it is useful to study the distribution of the parsimony length of a tree under the null model, in which the leaves are independently assigned letters according to prescribed probabilities. Except in one special case, this distribution is difficult to describe exactly. Here the authors analyze this distribution by providing a recursive and readily computable description and by establishing large deviation bounds for the parsimony length of a fixed tree on a single site and for the minimum length (maximum parsimony) tree over several sites. They also show that, under very general conditions, the former distribution is asymptotically normal, thereby settling a recent conjecture. Furthermore, the authors show how the mean and variance of this distribution can be calculated efficiently. The proof of normality requires a number of new and recent results: the parsimony length is not directly expressible as a sum of independent random variables, so normality does not follow immediately from a standard central limit theorem.
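To make the null model concrete, the following Python sketch estimates the distribution of the parsimony length of a fixed tree by simulation. It is an illustration only, not the authors' recursive description or their method for the mean and variance: it uses Fitch's standard small-parsimony algorithm on a rooted binary tree and plain Monte Carlo sampling. The nested-tuple tree encoding and the names `fitch_length`, `leaf_names`, and `simulate_null_lengths` are assumptions introduced here.

```python
import random
from statistics import mean, pvariance


def fitch_length(tree, leaf_states):
    """Parsimony length of a rooted binary tree via Fitch's algorithm.

    tree: nested 2-tuples; a leaf is a string label (assumed encoding).
    leaf_states: dict mapping each leaf label to its letter.
    """
    def visit(node):
        if isinstance(node, str):
            # A leaf contributes its own letter and no changes.
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # Children agree on at least one letter: no extra change.
            return common, left_cost + right_cost
        # Children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def leaf_names(tree):
    """List the leaf labels of the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return leaf_names(tree[0]) + leaf_names(tree[1])


def simulate_null_lengths(tree, probs, n_samples, seed=0):
    """Sample parsimony lengths with leaves assigned letters independently.

    probs: dict mapping each letter to its prescribed probability.
    """
    rng = random.Random(seed)
    letters = list(probs)
    weights = [probs[a] for a in letters]
    names = leaf_names(tree)
    lengths = []
    for _ in range(n_samples):
        draw = rng.choices(letters, weights=weights, k=len(names))
        lengths.append(fitch_length(tree, dict(zip(names, draw))))
    return lengths


if __name__ == "__main__":
    # Hypothetical four-leaf tree and uniform letter probabilities.
    tree = (("a", "b"), ("c", "d"))
    probs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    lengths = simulate_null_lengths(tree, probs, n_samples=10_000)
    print("estimated mean:", mean(lengths))
    print("estimated variance:", pvariance(lengths))
```

Simulation of this kind gives only sample estimates of the mean and variance, and it becomes slow for large trees or when tail probabilities are needed. Those limitations are what exact recursive computation and large deviation bounds of the kind described in the abstract are meant to address.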