K. Yu, X. Xu, A. Schwaighofer, V. Tresp, and H.-P. Kriegel
Memory-based collaborative filtering (CF) has a limited range of application
because of its high memory consumption and long runtime. The approach presented
in this paper removes redundant and inconsistent instances (users) from the
data. Our work shows that satisfactory accuracy can be achieved using
only a small portion of the original data set, thereby reducing the
storage and runtime cost of the CF algorithm. In our approach, we consider
instance selection as the problem of selecting informative data that increase
the a posteriori probability of the optimal model. We evaluate our
approach empirically on two real-world data sets. Data size and prediction
time are significantly reduced, while prediction accuracy remains on a par
with that achieved using the complete database.
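
To make the setting concrete, the following is a minimal sketch of memory-based CF run over a reduced set of stored users. It is an illustration under stated assumptions, not the method of this paper: the Pearson-weighted predictor is a common textbook formulation, the names pearson, predict, and select_users are hypothetical, and select_users picks a uniform random subset purely as a placeholder for an informed instance-selection criterion.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation over items rated by both users; 0.0 if undefined."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    den = math.sqrt(var_a * var_b)
    return num / den if den else 0.0


def predict(active, item, users):
    """Predict the active user's rating of `item` from the stored users.

    Uses the active user's mean rating plus a similarity-weighted average
    of the stored users' deviations from their own means.
    """
    mean_active = sum(active.values()) / len(active)
    num = 0.0
    den = 0.0
    for u in users:
        if item not in u:
            continue  # this stored user gives no evidence about the item
        w = pearson(active, u)
        mean_u = sum(u.values()) / len(u)
        num += w * (u[item] - mean_u)
        den += abs(w)
    return mean_active + num / den if den else mean_active


def select_users(users, fraction, seed=0):
    """Placeholder selection (assumption): a uniform random subset.

    An informed criterion would replace this function; only the reduced
    list it returns has to be stored and scanned at prediction time.
    """
    k = max(1, int(len(users) * fraction))
    return random.Random(seed).sample(users, k)
```

Both memory use and prediction time in this sketch grow linearly with the number of stored users, which is why shrinking the list passed to predict reduces both costs.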