Article ID: | iaor201523889 |
Volume: | 30 |
Issue: | 8 |
Start Page Number: | 1445 |
End Page Number: | 1459 |
Publication Date: | Dec 2014 |
Journal: | Quality and Reliability Engineering International |
Authors: | Joh HyunChul, Malaiya Yashwant K |
Keywords: | simulation, statistics: distributions |
A vulnerability discovery model attempts to model the rate at which vulnerabilities are discovered in a software product. Recent studies have shown that the S‐shaped Alhazmi–Malaiya Logistic (AML) vulnerability discovery model often fits better than other models and demonstrates superior prediction capabilities for several major software systems. However, the AML model is based on the logistic distribution, which assumes a symmetrical discovery process with a peak in the center. Hence, when the discovery process does not follow a symmetrical pattern, a discovery model based on an asymmetrical distribution might be expected to perform better. Here, the relationship between the performance of S‐shaped vulnerability discovery models and the skewness of the target vulnerability datasets is examined. To study the possible dependence on skew, alternative S‐shaped models based on the Weibull, Beta, Gamma and Normal distributions are introduced and evaluated. The models are fitted to data from eight major software systems. The applicability of the models is examined using two separate approaches: a goodness‐of‐fit test to see how well the models track the data, and prediction capability measured by average error and average bias. It is observed that an excellent goodness of fit does not necessarily result in superior prediction capability. The results show that, when prediction capability is considered, all the right‐skewed datasets are represented better by the Gamma distribution‐based model. The symmetrical models tend to predict better for left‐skewed datasets; the AML model is found to be the best among them.
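The abstract does not reproduce the model equations, so the following is only a minimal sketch of the ideas it names. It assumes the commonly cited three-parameter logistic form of the AML model, with cumulative vulnerabilities B / (B·C·exp(−A·B·t) + 1), and it assumes that average error and average bias are the mean absolute and mean signed prediction errors, each normalized by the actual final vulnerability count. The function names, parameter names (A, B, C) and the numeric values are illustrative assumptions, not taken from the article.

```python
import math


def aml_cumulative(t, A, B, C):
    """Cumulative vulnerabilities at time t under the assumed AML logistic form.

    B is the eventual total number of vulnerabilities; A and C shape the curve.
    """
    return B / (B * C * math.exp(-A * B * t) + 1.0)


def aml_midpoint(A, B, C):
    """Time at which the assumed AML curve reaches half of B (the peak of the discovery rate)."""
    return math.log(B * C) / (A * B)


def average_error(predicted_totals, actual_total):
    """Mean absolute normalized error of a sequence of predictions (assumed definition)."""
    n = len(predicted_totals)
    return sum(abs((p - actual_total) / actual_total) for p in predicted_totals) / n


def average_bias(predicted_totals, actual_total):
    """Mean signed normalized error; negative values indicate underestimation (assumed definition)."""
    n = len(predicted_totals)
    return sum((p - actual_total) / actual_total for p in predicted_totals) / n


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to illustrate the symmetry.
    A, B, C = 0.002, 100.0, 0.5
    t_mid = aml_midpoint(A, B, C)
    d = 5.0
    before = aml_cumulative(t_mid - d, A, B, C)
    after = aml_cumulative(t_mid + d, A, B, C)
    # Symmetry about the midpoint: the two values sum to B.
    print(t_mid, before, after, before + after)

    # Hypothetical predictions of the final total made at successive times.
    predictions = [90.0, 110.0]
    print(average_error(predictions, 100.0), average_bias(predictions, 100.0))
```

The symmetry check illustrates why a logistic-based model might be expected to struggle on skewed datasets: the curve gains exactly as much after its midpoint as it accumulated before it. The two error measures can also diverge, since over- and underestimates cancel in the bias but not in the error.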