Antecedents of open source software defects: A data mining approach to model formulation, validation and testing

Article ID:	iaor201095
Volume:	10
Issue:	4
Start Page Number:	235
End Page Number:	251
Publication Date:	Dec 2009
Journal:	Information Technology and Management
Authors:	Raja Uzma, Tretter Marietta J
Keywords:	data mining

Abstract:

This paper develops, tests, and validates a model for the antecedents of open source software (OSS) defects, using Data and Text Mining. The public archives of OSS projects are used to access historical data on over 5,000 active and mature OSS projects. Using domain knowledge and exploratory analysis, a wide range of variables is identified from the process, product, resource, and end-user characteristics of a project to ensure that the model is robust and considers all aspects of the system. Multiple Data Mining techniques are used to refine the model, and the data is enriched by the use of Text Mining for knowledge discovery from qualitative information. The study demonstrates the suitability of Data Mining and Text Mining for model building. Results indicate that project type, end-user activity, process quality, team size, and project popularity have a significant impact on the defect density of operational OSS projects. Since many organizations, both for-profit and not-for-profit, are beginning to use OSS as an economic alternative to commercial software, these results can inform decisions about which software an organization can reasonably maintain.
