Article ID: | iaor201112581 |
Volume: | 38 |
Issue: | 4 |
Start Page Number: | 650 |
End Page Number: | 665 |
Publication Date: | Dec 2011 |
Journal: | Scandinavian Journal of Statistics |
Authors: | Liu Hai, Chan Kung-Sik |
Keywords: | statistics: general, statistics: sampling, statistics: regression, probability, datamining, heuristics |
Zero-inflated data abound in ecological studies as well as in other scientific fields. Non-parametric regression with a zero-inflated response may be studied via the zero-inflated generalized additive model (ZIGAM), which uses a probabilistic mixture of a point mass at zero and a regular exponential family component. We propose the (partially) constrained ZIGAM, which assumes that some covariates affect the probability of non-zero-inflation and the mean of the regular exponential family distribution proportionally on the link scales. When this assumption holds, the new approach provides a unified framework for modelling zero-inflated data that is more parsimonious and efficient than the unconstrained ZIGAM. We develop an iterative estimation algorithm and discuss the construction of confidence intervals for the estimator. Some asymptotic properties are derived. We also propose a Bayesian model selection criterion for choosing between the unconstrained and constrained ZIGAMs. The new methods are illustrated with both simulated data and a real application to jellyfish abundance data.
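
To make the proportionality constraint concrete, the following minimal sketch simulates data from one possible constrained zero-inflated Poisson model. It is an illustration under stated assumptions, not the authors' estimation algorithm: the Poisson family, the log and logit links, the smooth function sin(2πx), the parameter names alpha and delta, and the function name simulate_constrained_zip are all hypothetical choices that do not come from the abstract. The constraint is written here as logit(p) = alpha + delta * log(mu), so a single smooth predictor drives both the probability of non-zero-inflation p and the Poisson mean mu.

```python
import numpy as np


def simulate_constrained_zip(n, alpha=0.5, delta=1.0, seed=0):
    """Simulate from a constrained zero-inflated Poisson model (illustrative only).

    Assumed form: log(mu) = eta(x) and logit(p) = alpha + delta * eta(x),
    where p is the probability that an observation comes from the regular
    (Poisson) component rather than the point mass at zero.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)

    # Hypothetical smooth predictor on the log-link scale.
    eta = 1.0 + np.sin(2.0 * np.pi * x)
    mu = np.exp(eta)

    # Constraint: the logit of p is a linear function of the same predictor.
    p = 1.0 / (1.0 + np.exp(-(alpha + delta * eta)))

    # Mixture draw: Poisson count with probability p, structural zero otherwise.
    from_regular = rng.random(n) < p
    y = np.where(from_regular, rng.poisson(mu), 0)
    return x, y, p, mu


if __name__ == "__main__":
    x, y, p, mu = simulate_constrained_zip(1000)
    print("share of zeros:", np.mean(y == 0))
```

An unconstrained ZIGAM would instead give p its own smooth function of the covariates. Under the assumptions above, the constrained form replaces that second smooth with just two scalars, alpha and delta, which is where the parsimony described in the abstract comes from.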