Article ID: | iaor20119401 |
Volume: | 36 |
Issue: | 2 |
Start Page Number: | 205 |
End Page Number: | 218 |
Publication Date: | Oct 2011 |
Journal: | Journal of Productivity Analysis |
Authors: | Simar Léopold, Wilson W |
Keywords: | statistics: data envelopment analysis |
This paper examines the widespread practice in which data envelopment analysis (DEA) efficiency estimates are regressed on environmental variables in a second‐stage analysis. In the literature, only two statistical models have been proposed in which second‐stage regressions are well‐defined and meaningful. In the model considered by Simar and Wilson (2007), truncated regression provides consistent estimation in the second stage, whereas in the model proposed by Banker and Natarajan (2008), ordinary least squares (OLS) provides consistent estimation. This paper examines, compares, and contrasts the very different assumptions underlying these two models, and makes clear that second‐stage OLS estimation is consistent only under very peculiar and unusual assumptions on the data‐generating process, which limit its applicability. In addition, we show that in either case, bootstrap methods provide the only feasible means for inference in the second stage. We also comment on ad hoc specifications of second‐stage regression equations that ignore the part of the data‐generating process that yields the data used to obtain the initial DEA estimates.
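For readers unfamiliar with the two-stage procedure the abstract discusses, the following is a minimal sketch of its mechanics, not the authors' method. It assumes an input-oriented, constant-returns-to-scale DEA model (the abstract does not specify orientation or returns to scale). The function names `dea_input_efficiency` and `second_stage_ols` are hypothetical. The second stage shown is the naive OLS regression whose validity the paper questions, and the sketch includes neither the truncated-regression alternative nor the bootstrap inference the paper recommends.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_efficiency(X, Y):
    """First stage: input-oriented, constant-returns DEA score per unit.

    X is an (n, p) array of inputs, Y an (n, q) array of outputs.
    The orientation and returns-to-scale choice are assumptions made
    for illustration only.
    """
    n, p = X.shape
    q = Y.shape[1]
    scores = np.empty(n)
    for i in range(n):
        # Decision vector: [theta, lambda_1, ..., lambda_n]; minimize theta.
        c = np.r_[1.0, np.zeros(n)]
        # Input constraints: sum_j lambda_j * x_j - theta * x_i <= 0
        A_in = np.hstack([-X[i].reshape(p, 1), X.T])
        # Output constraints: -sum_j lambda_j * y_j <= -y_i
        A_out = np.hstack([np.zeros((q, 1)), -Y.T])
        res = linprog(
            c,
            A_ub=np.vstack([A_in, A_out]),
            b_ub=np.r_[np.zeros(p), -Y[i]],
            bounds=[(0, None)] * (n + 1),
            method="highs",
        )
        scores[i] = res.fun
    return scores


def second_stage_ols(scores, Z):
    """Second stage: naive OLS of DEA scores on environmental variables Z.

    Returns the coefficient vector with the intercept first. The point
    estimates alone say nothing about valid inference.
    """
    design = np.column_stack([np.ones(len(scores)), Z])
    beta, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return beta


if __name__ == "__main__":
    # Toy data: one input, one output, three units (purely illustrative).
    X = np.array([[1.0], [2.0], [4.0]])
    Y = np.array([[1.0], [1.0], [1.0]])
    Z = np.array([0.0, 2.0, 3.0])  # hypothetical environmental variable
    theta = dea_input_efficiency(X, Y)
    print(theta)
    print(second_stage_ols(theta, Z))
```

The sketch makes one feature of the procedure visible: the dependent variable in the second stage is itself an estimate computed from the full sample, so the scores are not independent draws. This is the kind of issue the abstract has in mind when it cautions against specifications that ignore how the first-stage data are generated.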