Article ID: | iaor20161489 |
Volume: | 32 |
Issue: | 2 |
Start Page Number: | 196 |
End Page Number: | 215 |
Publication Date: | May 2016 |
Journal: | Computational Intelligence |
Authors: | Gashler Michael S, Smith Michael R, Morris Richard, Martinez Tony |
Keywords: | statistics: general, statistics: regression, matrices, learning |
Many data mining and data analysis techniques operate on dense matrices or complete tables of data. Real‐world data sets, however, often contain unknown values. Even classification algorithms designed to operate with missing values often exhibit deteriorated accuracy. One approach to handling missing values is to fill in (impute) the missing values. In this article, we present a technique for unsupervised learning called unsupervised backpropagation (UBP), which trains a multilayer perceptron to fit the manifold sampled by a set of observed point vectors. We evaluate UBP on the task of imputing missing values in data sets and show that UBP predicts missing values with a significantly lower sum of squared error than other collaborative filtering and imputation techniques. We also demonstrate, using 24 data sets and nine supervised learning algorithms, that classification accuracy is usually higher when randomly withheld values are imputed using UBP rather than with other methods.
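The abstract describes UBP as training a multilayer perceptron to fit the manifold sampled by the observed vectors, which implies jointly learning a low-dimensional latent vector for each row together with the network weights, then reading imputed values off the network's output. The sketch below illustrates that idea under stated assumptions; it is a minimal NumPy illustration, not the authors' implementation, and all names (`ubp_impute`, the latent/hidden sizes, the single tanh hidden layer, plain gradient descent) are choices made here for brevity.

```python
import numpy as np

def ubp_impute(X, n_latent=2, n_hidden=8, epochs=2000, lr=0.05, seed=0):
    """Sketch of UBP-style imputation: jointly learn per-row latent
    vectors and MLP weights by gradient descent on the squared error
    over observed entries only, then fill missing entries from the
    MLP's reconstruction. (Illustrative, not the paper's algorithm.)"""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)                  # True where a value is observed
    X0 = np.where(mask, X, 0.0)          # zero-filled copy for arithmetic
    n, d = X.shape
    # Small random initialisation of latent inputs and MLP weights.
    V  = rng.normal(0.0, 0.1, (n, n_latent))
    W1 = rng.normal(0.0, 0.1, (n_latent, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d));        b2 = np.zeros(d)
    n_obs = mask.sum()
    for _ in range(epochs):
        H = np.tanh(V @ W1 + b1)         # hidden-layer activations
        Xhat = H @ W2 + b2               # reconstructed rows
        # Gradient of mean squared error, restricted to observed cells.
        G = 2.0 * np.where(mask, Xhat - X0, 0.0) / n_obs
        gW2 = H.T @ G;  gb2 = G.sum(axis=0)
        gH  = G @ W2.T
        gP  = gH * (1.0 - H * H)         # backprop through tanh
        gW1 = V.T @ gP; gb1 = gP.sum(axis=0)
        gV  = gP @ W1.T                  # gradient w.r.t. latent inputs
        # Simultaneous step on weights and latent vectors.
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1
        V  -= lr * gV
    Xhat = np.tanh(V @ W1 + b1) @ W2 + b2
    return np.where(mask, X, Xhat)       # keep observed values unchanged
```

A typical use would pass a matrix containing `np.nan` for the unknown cells and receive a dense matrix back, with observed entries preserved exactly and missing entries replaced by the network's predictions.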