The many approximation theorems for neural networks do not address the approximation of linear functions per se. To represent a linear function, the network must construct it from superpositions of nonlinear activation functions such as the sigmoid. This issue matters for applications of NNs in statistical tests for neglected nonlinearity, where it is common practice to include a linear component through skip-layer connections. Our theoretical analysis and evidence point in the same direction: the network can in fact provide linear approximations without additional ‘assistance’. We therefore suggest that skip-layer connections are unnecessary and, if employed, could lead to misleading results.
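
As an illustration of how a sigmoid unit can mimic a linear function, the following minimal sketch uses the Taylor expansion of the logistic sigmoid around zero, sigmoid(z) = 1/2 + z/4 + O(z^3). A single hidden unit with a small input weight `w`, rescaled and shifted by the output layer, then approximates the identity on a bounded interval. This is not the construction used in the paper; the function names, the interval [-5, 5], and the chosen weights are assumptions made only for this example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_via_sigmoid(x, w):
    """Approximate f(x) = x with one sigmoid hidden unit (hypothetical helper).

    Hidden unit:  sigmoid(w * x), with a small input weight w.
    Output layer: weight 4 / w and bias -2 / w, i.e. (4 / w) * (sigmoid(w * x) - 0.5).
    The leading error term is -(w ** 2) * (x ** 3) / 12, so it shrinks as w -> 0.
    """
    return (4.0 / w) * (sigmoid(w * x) - 0.5)


def max_abs_error(w, grid):
    """Largest deviation from the identity function over the grid."""
    return max(abs(linear_via_sigmoid(x, w) - x) for x in grid)


# Assumed evaluation grid: x in [-5, 5] in steps of 0.1.
grid = [i / 10.0 for i in range(-50, 51)]

# Maximum approximation error for progressively smaller input weights.
errors = {w: max_abs_error(w, grid) for w in (1.0, 0.1, 0.01)}

for w, err in errors.items():
    print(f"w = {w:<5} max |error| = {err:.6f}")
```

The error falls roughly with the square of `w`, which suggests that, at least on a bounded domain, a sigmoid network needs no separate linear path to reproduce a linear function to a given tolerance.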