Article ID: | iaor20121417 |
Volume: | 45 |
Issue: | 1 |
Start Page Number: | 258 |
End Page Number: | 265 |
Publication Date: | Mar 2012 |
Journal: | Accident Analysis and Prevention |
Authors: | Lord Dominique, Geedipally Srinivas Reddy, Dhavala Soma Sekhar |
Keywords: | statistics: regression |
Transportation safety analysts have devoted considerable work to the development and application of new and innovative models for analyzing crash data. One important characteristic of crash data documented in the literature is that some datasets contain a large number of zeros and a long or heavy tail (which creates highly dispersed data). For such datasets, the number of sites where no crash is observed is so large that traditional distributions and regression models, such as the Poisson and Poisson‐gamma or negative binomial (NB) models, cannot be used efficiently. To overcome this problem, the NB‐Lindley (NB‐L) distribution has recently been introduced for analyzing count data that are characterized by excess zeros. The objective of this paper is to document the application of an NB generalized linear model with Lindley mixed effects (NB‐L GLM) for analyzing traffic crash data. The study objective was accomplished using simulated and observed datasets. The simulated dataset was used to show the general performance of the model. The model was then applied to two datasets based on observed data. One of the datasets was characterized by a large number of zeros. The NB‐L GLM was compared with the NB and zero‐inflated models. Overall, the study shows that the NB‐L GLM offers superior performance over the NB and zero‐inflated models not only when datasets are characterized by a large number of zeros and a long tail, but also when the crash dataset is highly dispersed.
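The abstract does not state the model equations, so the following is only a minimal sketch of how counts with excess zeros could be simulated from an NB‐Lindley mixture. It assumes a hierarchical form in which the NB mean at a site is multiplied by a Lindley-distributed random effect. The parameter names (`mu`, `phi`, `theta`) and all function names are hypothetical and are not taken from the paper. The sketch relies on the standard fact that a Lindley(θ) variable is a mixture of Gamma(1, θ) and Gamma(2, θ) with weights θ/(1+θ) and 1/(1+θ).

```python
import math
import random


def sample_lindley(theta, rng):
    """Draw from Lindley(theta) via its two-component gamma mixture."""
    # With probability theta/(1+theta) use shape 1, otherwise shape 2;
    # both components share rate theta (scale 1/theta).
    shape = 1 if rng.random() < theta / (1.0 + theta) else 2
    return rng.gammavariate(shape, 1.0 / theta)


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small means only."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_nb(mean, phi, rng):
    """NB count as a Poisson-gamma mixture (variance = mean + mean**2 / phi)."""
    if mean <= 0.0:
        return 0
    lam = rng.gammavariate(phi, mean / phi)
    return sample_poisson(lam, rng)


def sample_nb_lindley(mu, phi, theta, rng):
    """Assumed hierarchy: eps ~ Lindley(theta), then Y ~ NB(mean=eps*mu, phi)."""
    eps = sample_lindley(theta, rng)
    return sample_nb(eps * mu, phi, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    counts = [sample_nb_lindley(1.0, 0.5, 2.0, rng) for _ in range(10000)]
    zero_share = sum(c == 0 for c in counts) / len(counts)
    print("share of zeros:", zero_share, "largest count:", max(counts))
```

In a regression setting, `mu` would typically depend on site covariates through a log link; that step is omitted here because the abstract gives no functional form.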