Article ID: | iaor2005785 |
Country: | Netherlands |
Volume: | 37 |
Issue: | 4 |
Start Page Number: | 461 |
End Page Number: | 474 |
Publication Date: | Sep 2004 |
Journal: | Decision Support Systems |
Authors: | Bhattacharyya Siddhartha, Mehta Kumar |
Keywords: | neural networks, data mining |
A crucial issue in data mining on time series is the duration of the training period. The training horizon used affects the nature of the rules obtained and their predictability over time. Longer training horizons are generally sought, in order to discern sustained patterns with robust training-data performance that extends well into the predictive period. However, in dynamic environments, patterns that persist over time may be unavailable, and shorter-term patterns may hold higher predictive ability, albeit with shorter predictive periods. Such potentially useful shorter-term patterns may be lost when the training duration covers much longer periods. Too short a training duration can, of course, be susceptible to over-fitting to noise. We conduct experiments using different training horizons with daily data for the S&P 500 index and report the sensitivity of the performance of the obtained rules to the training duration. We show that while the performance of the rules in the training period is important for inducing the “best” rules, it is not indicative of their performance in the test period, and we propose alternative measures that can be used to help identify appropriate training durations.
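The kind of experiment the abstract describes, varying the training horizon and comparing training-period with test-period performance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' method: the return series is synthetic rather than S&P 500 data, the candidate "rules" are simple momentum rules indexed by a hypothetical `lookback` parameter, and the names `make_returns`, `rule_return`, and `horizon_sensitivity` are invented here.

```python
import random


def make_returns(n, seed=0):
    """Synthetic daily returns (assumption: stands in for real index data)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0003, 0.01) for _ in range(n)]


def rule_return(returns, lookback, start, end):
    """Mean daily return over returns[start:end] of a hypothetical momentum
    rule: hold the index on day t if the previous `lookback` returns sum > 0."""
    if start < lookback:
        raise ValueError("window starts before enough history is available")
    total = 0.0
    for t in range(start, end):
        if sum(returns[t - lookback:t]) > 0:
            total += returns[t]
    return total / (end - start)


def horizon_sensitivity(returns, test_start, test_len, horizons, lookbacks):
    """For each training horizon, pick the rule that is best in training and
    record its training-period and test-period performance."""
    results = {}
    for h in horizons:
        train_start = test_start - h
        # "Induce" the best rule using training-period performance only.
        best = max(
            lookbacks,
            key=lambda lb: rule_return(returns, lb, train_start, test_start),
        )
        train_perf = rule_return(returns, best, train_start, test_start)
        test_perf = rule_return(returns, best, test_start, test_start + test_len)
        results[h] = (best, train_perf, test_perf)
    return results


if __name__ == "__main__":
    rets = make_returns(1500)
    table = horizon_sensitivity(rets, 1000, 250, [125, 250, 500], [5, 20, 60])
    for horizon, (lb, train_perf, test_perf) in table.items():
        print(f"horizon={horizon:4d} lookback={lb:3d} "
              f"train={train_perf:+.5f} test={test_perf:+.5f}")
```

Comparing the `train` and `test` columns across horizons is one simple way to see whether training-period performance carries over to the test period. With real data, the same loop could be repeated over several test windows to reduce the influence of any single period.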