To address the data sparsity problem in collaborative filtering recommendation, this paper proposes an improved algorithm based on stepwise prediction. The algorithm first preprocesses the rating matrix: it rearranges the matrix elements so that the ratings are concentrated in the upper-left corner, and partially fills the missing ratings of users who have rated too few items. It then extracts a subsystem with high data density from the rating matrix and fills its missing values with a trust-based collaborative filtering algorithm. Finally, it achieves stepwise prediction by repeatedly adding new users and new items to the subsystem. Experimental results on the MovieLens dataset show that the new algorithm effectively alleviates the data sparsity problem and improves recommendation accuracy.
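The rearrangement and subsystem-extraction steps described above can be illustrated with a minimal sketch. The abstract does not state the ordering criterion, so this sketch assumes that users and items are sorted by their number of ratings in descending order. The function names `rearrange_by_density` and `dense_block` are hypothetical, and the tiny matrix is made-up illustration data, not MovieLens. The partial filling and the trust-based filling are not shown.

```python
def rearrange_by_density(ratings):
    """Reorder rows (users) and columns (items) so the most-rated come first.

    `ratings` is a list of rows; None marks a missing rating.
    Returns the reordered matrix and the row and column orders used.
    Assumption: ordering by rating count; the paper may use another criterion.
    """
    n_rows, n_cols = len(ratings), len(ratings[0])
    row_counts = [sum(v is not None for v in row) for row in ratings]
    col_counts = [sum(row[j] is not None for row in ratings) for j in range(n_cols)]
    # sorted() is stable, so ties keep their original relative order.
    row_order = sorted(range(n_rows), key=lambda i: -row_counts[i])
    col_order = sorted(range(n_cols), key=lambda j: -col_counts[j])
    reordered = [[ratings[i][j] for j in col_order] for i in row_order]
    return reordered, row_order, col_order


def dense_block(reordered, n_users, n_items):
    """Take the upper-left block and return it with its observed-rating density."""
    block = [row[:n_items] for row in reordered[:n_users]]
    observed = sum(v is not None for row in block for v in row)
    return block, observed / (n_users * n_items)


# Made-up 4 users x 4 items rating matrix (None = not rated).
R = [
    [None, 3.0, None, None],
    [5.0, 4.0, None, 2.0],
    [None, 1.0, None, None],
    [4.0, 5.0, 3.0, None],
]

reordered, rows, cols = rearrange_by_density(R)
block, density = dense_block(reordered, 2, 2)
print(rows, cols)      # user and item orders after rearrangement
print(block, density)  # upper-left subsystem and its density
```

In this toy case the full matrix has 7 of 16 entries observed, while the 2 x 2 upper-left block is fully observed. This is the kind of high-density subsystem from which the stepwise prediction would start before further users and items are added.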