Motivated by the lack of evidence for the conjecture that the back-propagation neural network (BPNN), as a universal approximator, should perform at least comparably to linear models on linear data, this study is designed to answer two primary research questions: 'How does the BPNN perform with respect to various underlying ARMA(p,q) structures?' and 'How does the level of noise in the training time series affect the BPNN's performance?' The goal is to better understand the modelling and forecasting ability of BPNNs on this special class of time series and to suggest training strategies that improve performance. Using the performance of Box–Jenkins models as a benchmark, it is concluded that BPNNs generally performed well and consistently across time series corresponding to ARMA(p,q) structures. BPNNs' ability to model and forecast is affected not by the number of parameters but by the magnitude of the coefficients of the underlying structure. Overall, BPNNs perform significantly better for most structures when an appropriate noise level is incorporated during network training. A proper strategy, therefore, is to train networks at a noise level consistent in magnitude with the sample standard deviation of the time series.
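To make the recommended training strategy concrete, the sketch below (not the authors' code) simulates an ARMA(1,1) series, trains a one-hidden-layer back-propagation network on lagged inputs, and sets the scale of Gaussian noise injected during training to the sample standard deviation of the training series. The ARMA coefficients, network size, learning rate, and the reading of 'training noise level' as input-noise injection are all illustrative assumptions, not specifics drawn from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_arma11(n, phi=0.6, theta=0.3, sigma=1.0):
    """Simulate ARMA(1,1): y_t = phi*y_{t-1} + e_t + theta*e_{t-1}."""
    e = rng.normal(0.0, sigma, n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + e[t] + theta * e[t - 1]
    return y

def make_windows(y, p=4):
    """Lagged inputs X[t] = (y_{t-p}, ..., y_{t-1}) with target y_t."""
    X = np.stack([y[i:i + p] for i in range(len(y) - p)])
    return X, y[p:]

def train_bpnn(X, t, hidden=8, lr=0.05, epochs=2000, noise_sd=0.0):
    """One-hidden-layer feedforward net trained by back-propagation
    (full-batch gradient descent on mean squared error). Gaussian noise
    of scale noise_sd is injected into the inputs at every epoch."""
    n, p = X.shape
    W1 = rng.normal(0.0, 0.3, (p, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.3, hidden);      b2 = 0.0
    for _ in range(epochs):
        Xn = X + rng.normal(0.0, noise_sd, X.shape)  # noise injection
        h = np.tanh(Xn @ W1 + b1)                    # hidden activations
        err = (h @ W2 + b2) - t                      # output error
        # back-propagated gradients of 0.5 * mean(err^2)
        dh = np.outer(err, W2) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ err) / n;  b2 -= lr * err.mean()
        W1 -= lr * (Xn.T @ dh) / n;  b1 -= lr * dh.mean(axis=0)
    return lambda Xq: np.tanh(Xq @ W1 + b1) @ W2 + b2

y = simulate_arma11(600)
X, t = make_windows(y)
Xtr, ttr, Xte, tte = X[:400], t[:400], X[400:], t[400:]
# The recommended strategy: match the training noise level to the
# sample standard deviation of the training portion of the series.
model = train_bpnn(Xtr, ttr, noise_sd=np.std(y[:400], ddof=1))
rmse = np.sqrt(np.mean((model(Xte) - tte) ** 2))
print(f"one-step-ahead test RMSE: {rmse:.3f}")
```

Setting noise_sd=0.0 gives a no-noise baseline, so the effect of matching the noise level to the sample standard deviation can be compared directly on the held-out portion.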