Article ID: iaor201112556
Volume: 38
Issue: 2
Start Page Number: 268
End Page Number: 287
Publication Date: Jun 2011
Journal: Scandinavian Journal of Statistics
Authors: Liquet Benoit, Commenges Daniel
Keywords: health services, decision, decision: rules, statistics: decision, datamining, simulation: applications, simulation: analysis, risk
It is quite common in epidemiology that we wish to assess the quality of estimators on a particular set of information, whereas the estimators may use a larger set of information. Two examples are studied. The first occurs when we construct a model for an event which happens when a continuous variable exceeds a certain threshold; we can compare estimators based on observation of only the event or of the whole continuous variable. The second is predicting survival based only on survival information or using, in addition, information on a disease. We develop modified Akaike information criterion (AIC) and likelihood cross-validation (LCV) criteria to compare estimators in this non-standard situation. We show that a normalized difference of AIC has a bias equal to o(n⁻¹) if the estimators are based on well-specified models; a normalized difference of LCV always has a bias equal to o(n⁻¹). A simulation study shows that both criteria work well, although the normalized difference of LCV tends to be better and is more robust. Moreover, in the case of well-specified models, the difference of risks reduces to the difference of statistical risks, which can be estimated rather precisely. For ‘compatible’ models the difference of statistical risks is often the main term, but there can also be a difference of mis-specification risks.
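As a rough illustration of the quantities named in the abstract, the sketch below computes a normalized difference of AIC and a normalized leave-one-out LCV difference for two candidate parametric models fitted to the same data. This is not the authors' implementation: the choice of models (Gaussian vs. Laplace), the simulated data, and the normalization by 2n are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=200)  # simulated sample (assumption)
n = len(x)

def gaussian_fit_loglik(train, test):
    """Gaussian MLE on `train`; returns (log-density of `test`, number of parameters)."""
    mu, sigma = train.mean(), train.std(ddof=0)
    return stats.norm.logpdf(test, mu, sigma).sum(), 2

def laplace_fit_loglik(train, test):
    """Laplace MLE on `train`; returns (log-density of `test`, number of parameters)."""
    loc = np.median(train)
    scale = np.mean(np.abs(train - loc))
    return stats.laplace.logpdf(test, loc, scale).sum(), 2

def aic(fit, sample):
    """AIC = -2 * maximized log-likelihood + 2 * number of parameters."""
    loglik, k = fit(sample, sample)
    return -2.0 * loglik + 2.0 * k

def lcv(fit, sample):
    """Leave-one-out likelihood cross-validation: minus the average
    log-density of each left-out observation under the model refitted
    on the remaining data."""
    contribs = []
    for i in range(len(sample)):
        train = np.delete(sample, i)
        ll, _ = fit(train, sample[i:i + 1])
        contribs.append(ll)
    return -np.mean(contribs)

# Normalized AIC difference: dividing by 2n puts it on a per-observation risk scale.
d_aic = (aic(gaussian_fit_loglik, x) - aic(laplace_fit_loglik, x)) / (2.0 * n)
# Normalized LCV difference: each LCV value is already an average over n observations.
d_lcv = lcv(gaussian_fit_loglik, x) - lcv(laplace_fit_loglik, x)

print(f"normalized AIC difference: {d_aic:.4f}")
print(f"normalized LCV difference: {d_lcv:.4f}")
```

Under these assumptions, a negative value of either difference favours the first (Gaussian) model; the paper's contribution is extending such comparisons to estimators built on different sets of observed information and characterizing the o(n⁻¹) bias of the normalized differences.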