Article ID: iaor20042885
Country: Netherlands
Volume: 147
Issue: 1
Start Page Number: 51
End Page Number: 61
Publication Date: May 2003
Journal: European Journal of Operational Research
Authors: Jenkins Larry, Anderson Murray
The usefulness of data envelopment analysis (DEA) depends on its ability to calculate the relative efficiency of decision making units (DMUs) using multiple inputs and outputs. Unfortunately, the greater the number of input and output variables, the less discerning the analysis. In practice, the input and output variables are usually highly correlated with one another, often reflecting no more than the relative size of each DMU. To counteract the limited discrimination provided by a DEA with many variables, analysts have for many years retained only some of the variables originally planned for the analysis, omitting on an ad hoc basis those that are highly correlated with the variables retained. In this paper, we describe a systematic statistical method for deciding which of the original correlated variables can be omitted with least loss of information, and which should be retained. Results on a number of published datasets reveal that omitting even highly correlated variables, which contain little additional information on performance, can have a major influence on the computed efficiency measures.
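The abstract does not spell out the statistical procedure, so the following is only an illustrative sketch of one way to rank variable subsets by "least loss of information". It assumes a partial-variance criterion: after standardizing the variables, each omitted variable is regressed on the retained ones, and the score is the share of total variance that the retained set accounts for. The function names (`correlation_matrix`, `retained_share`, `best_subset`) and the criterion itself are assumptions made here for illustration, not the authors' published method.

```python
from itertools import combinations
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(x * x for x in c)) for c in centered]
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is non-singular, i.e. the kept variables are not
    perfectly collinear.
    """
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def retained_share(corr, keep):
    """Share of total standardized variance accounted for by the kept variables.

    Each kept variable contributes 1; each omitted variable contributes its
    squared multiple correlation with the kept set (assumed criterion).
    """
    p = len(corr)
    r_ss = [[corr[i][j] for j in keep] for i in keep]
    explained = float(len(keep))
    for o in range(p):
        if o in keep:
            continue
        r_so = [corr[i][o] for i in keep]
        beta = solve(r_ss, r_so)
        explained += sum(b * r for b, r in zip(beta, r_so))
    return explained / p


def best_subset(columns, k):
    """Indices of the k variables that retain the largest share of variance."""
    corr = correlation_matrix(columns)
    return max(combinations(range(len(columns)), k),
               key=lambda keep: retained_share(corr, keep))


# Made-up illustrative data: x1 and x2 are nearly proportional, x3 is not.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]
x3 = [5, 1, 4, 2, 3]
columns = [x1, x2, x3]
corr = correlation_matrix(columns)
print(best_subset(columns, 2), retained_share(corr, (0, 1)))
```

On this toy data, keeping `x1` and `x2` together loses most of the information carried by `x3`, whereas dropping one of the two near-duplicates loses very little. As the abstract cautions, a small loss of statistical information does not guarantee that the DEA efficiency scores are unaffected, so any reduced variable set should still be checked by rerunning the DEA itself.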