Article ID: | iaor201523853 |
Volume: | 30 |
Issue: | 6 |
Start Page Number: | 891 |
End Page Number: | 903 |
Publication Date: | Oct 2014 |
Journal: | Quality and Reliability Engineering International |
Authors: | Han Xixuan, Clemmensen Line |
Keywords: | statistics: regression, datamining |
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in the prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly shrinks the coefficient of each observation in the estimated functions; thus, it is widely used to reduce the influence of outliers. We propose additionally adding weights to the slack variables in the constraints (CF‐weights) and call the combination of the two the doubly weighted SVR. We illustrate the differences and similarities between the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed into a LASSO problem plus a linear constraint and a box constraint. We demonstrate the capabilities of the doubly weighted approach through an example of house price prediction. The weight functions in the house pricing model depend on the geographical distance to the house of interest and the difference in time of sale (CF‐weights), as well as on differences in variables such as house size and number of floors (OF‐weights). The results illustrate that the combination of the two types of weights describes the relative importance of observations very well and lowers the influence of possible outliers, which enables the SVR models to perform well.
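As a rough illustration of the classic OF‐weights only, the sketch below fits a weighted SVR in which each observation's weight decays with its dissimilarity from a house of interest in size and number of floors. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the Gaussian weight function `of_weights`, its `bandwidth` parameter, and the SVR hyperparameters. It relies on the `sample_weight` argument of scikit-learn's `SVR.fit`, which rescales the penalty `C` per observation; scikit-learn offers no counterpart for the CF‐weights, so the doubly weighted model itself is not reproduced here.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic, made-up data: house size (m^2) and number of floors.
n = 200
size = rng.uniform(50.0, 250.0, n)
floors = rng.integers(1, 4, n).astype(float)
X = np.column_stack([size, floors])
price = 20.0 * size + 150.0 * floors + rng.normal(0.0, 100.0, n)

# Hypothetical house of interest: 120 m^2, two floors.
target = np.array([120.0, 2.0])


def of_weights(X, target, scale, bandwidth=1.0):
    """Gaussian weights that decay with the scaled feature distance to target.

    This weight function is an illustrative assumption, not the paper's.
    """
    d2 = (((X - target) / scale) ** 2).sum(axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


scale = X.std(axis=0)
weights = of_weights(X, target, scale)

# Standardize the features so the RBF kernel treats both on a similar scale.
mean = X.mean(axis=0)
X_std = (X - mean) / scale
target_std = ((target - mean) / scale).reshape(1, -1)

# sample_weight rescales C per observation: a small weight lets that
# observation's slack grow cheaply, which shrinks its influence on the fit.
model = SVR(kernel="rbf", C=1000.0, epsilon=10.0)
model.fit(X_std, price, sample_weight=weights)

prediction = model.predict(target_std)[0]
print(f"Predicted price for the house of interest: {prediction:.1f}")
```

Observations that resemble the house of interest receive weights near 1, while dissimilar ones are downweighted toward 0, so the fit is effectively local to the target.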