Article ID: | iaor20118876 |
Volume: | 215 |
Issue: | 3 |
Start Page Number: | 662 |
End Page Number: | 669 |
Publication Date: | Dec 2011 |
Journal: | European Journal of Operational Research |
Authors: | Johnson Andrew L, Nataraja Niranjan R |
Keywords: | statistics: regression |
Model misspecification has significant impacts on data envelopment analysis (DEA) efficiency estimates. This paper discusses the four most widely used approaches to guide variable specification in DEA. Using Monte Carlo simulations, we analyze the efficiency contribution measure (ECM), principal component analysis (PCA‐DEA), a regression‐based test, and bootstrapping for variable selection to determine each approach’s advantages and disadvantages. For a three‐input, one‐output production process, we find that: PCA‐DEA performs well with highly correlated inputs (correlation greater than 0.8), even for small data sets (fewer than 300 observations); both the regression and ECM approaches perform well under low correlation (less than 0.2) and relatively large data sets (at least 300 observations); and bootstrapping performs relatively poorly. Bootstrapping requires hours of computational time, whereas the other three methods require minutes. Based on the results, we offer guidelines for effectively choosing among the four selection methods.
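The findings summarized above can be read as a rough decision rule. The sketch below encodes only the thresholds stated in the abstract. The function name `suggest_selection_method`, its parameters, and the returned labels are hypothetical and do not come from the paper. The abstract does not say how to handle input correlations between 0.2 and 0.8, or low correlation with fewer than 300 observations, so those cases return "no clear guidance" rather than a guess. It also assumes the correlation is a non-negative pairwise correlation between inputs.

```python
def suggest_selection_method(input_correlation: float, n_observations: int) -> str:
    """Hypothetical helper: map the abstract's stated findings to a suggestion.

    input_correlation: assumed to be a non-negative pairwise correlation
        between inputs, in [0, 1] (an assumption, not stated in the abstract).
    n_observations: number of observations in the data set.
    """
    if not 0.0 <= input_correlation <= 1.0:
        raise ValueError("input_correlation must lie in [0, 1]")
    if n_observations <= 0:
        raise ValueError("n_observations must be positive")

    # PCA-DEA is reported to perform well for highly correlated inputs
    # (greater than 0.8), even with fewer than 300 observations.
    if input_correlation > 0.8:
        return "PCA-DEA"

    # The regression-based test and ECM are reported to perform well under
    # low correlation (less than 0.2) with at least 300 observations.
    if input_correlation < 0.2 and n_observations >= 300:
        return "regression-based test or ECM"

    # Everything else is outside what the abstract covers.
    return "no clear guidance"


if __name__ == "__main__":
    for corr, n in [(0.9, 100), (0.1, 500), (0.5, 500), (0.1, 100)]:
        print(corr, n, suggest_selection_method(corr, n))
```

Bootstrapping never appears as a suggestion because the abstract reports that it performs relatively poorly and takes hours rather than minutes. The full paper's guidelines may be more detailed than this sketch.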