This paper proposes a collaborative filtering algorithm, Multi-label Probabilistic Latent Semantic Analysis based Collaborative Filtering, that addresses the data sparsity problem by reducing the dimension of the user-rating matrix. First, the set of latent variables in probabilistic latent semantic analysis is fixed to the set of item labels (the multiple categories to which an item belongs), which gives each latent variable the meaning of its label and restricts its range. Then the distribution of the latent variables, i.e., the user's interest model, is learned iteratively and used to compress the user-rating matrix. Finally, the similarity between users is computed from the learned interest models, and recommendations are made for the target user. Simulation results show that the algorithm effectively addresses data sparsity: on the test dataset its mean absolute error is on average 4% lower than that of memory-based collaborative filtering algorithms. Compared with the collaborative filtering algorithm that reduces the dimension of the user-rating matrix by probabilistic latent semantic analysis, the proposed algorithm gives the latent variables explicit meanings, makes the recommender system easier to understand, and achieves competitive recommendation performance.
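The three steps summarized above can be illustrated with a minimal sketch. The abstract gives no update equations, so everything below is an assumption rather than the authors' method: it uses the standard aspect-model EM updates with rating values treated as counts, enforces the label constraint by zeroing P(i|z) wherever item i does not carry label z, compares users by cosine similarity of P(z|u), and predicts with a mean-centered neighbor average. The function names (`learn_interest_model`, `user_similarity`, `predict`) and parameters are hypothetical.

```python
import numpy as np

def learn_interest_model(R, G, n_iter=50, seed=0):
    """Assumed label-constrained PLSA fitted by EM.

    R: (U, I) rating matrix, 0 marks a missing rating.
    G: (I, K) binary matrix, G[i, z] = 1 if item i carries label z.
    Returns P(z|u) with shape (U, K) and P(i|z) with shape (I, K).
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    n_labels = G.shape[1]
    eps = 1e-12

    p_z_u = rng.random((n_users, n_labels)) + 1e-3
    p_z_u /= p_z_u.sum(axis=1, keepdims=True)
    # Label constraint: an item has zero probability under labels it lacks.
    p_i_z = (rng.random((n_items, n_labels)) + 1e-3) * G
    p_i_z /= p_i_z.sum(axis=0, keepdims=True) + eps

    for _ in range(n_iter):
        # E-step: q[u, i, z] is proportional to P(z|u) * P(i|z).
        q = p_z_u[:, None, :] * p_i_z[None, :, :]
        q /= q.sum(axis=2, keepdims=True) + eps
        # M-step: re-estimate both distributions from rating-weighted posteriors.
        w = R[:, :, None] * q
        p_z_u = w.sum(axis=1) + eps
        p_z_u /= p_z_u.sum(axis=1, keepdims=True)
        p_i_z = w.sum(axis=0) * G  # zeros stay zero, so the constraint persists
        p_i_z /= p_i_z.sum(axis=0, keepdims=True) + eps
    return p_z_u, p_i_z

def user_similarity(p_z_u):
    """Cosine similarity between the K-dimensional user interest vectors."""
    unit = p_z_u / (np.linalg.norm(p_z_u, axis=1, keepdims=True) + 1e-12)
    return unit @ unit.T

def predict(R, sim, u, i, k=2):
    """Mean-centered weighted average over the k most similar raters of item i."""
    rated = R > 0
    means = R.sum(axis=1) / np.maximum(rated.sum(axis=1), 1)
    neighbors = [v for v in np.argsort(-sim[u]) if v != u and rated[v, i]][:k]
    if not neighbors:
        return means[u]
    w = sim[u, neighbors]
    dev = R[neighbors, i] - means[neighbors]
    return means[u] + (w @ dev) / (np.abs(w).sum() + 1e-12)

# Toy data: items 0 and 1 carry label 0, items 2 and 3 carry label 1.
R = np.array([[5, 5, 1, 0],
              [4, 5, 0, 0],
              [1, 0, 5, 5],
              [0, 0, 4, 5]], dtype=float)
G = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

p_z_u, p_i_z = learn_interest_model(R, G)
sim = user_similarity(p_z_u)
print(np.round(p_z_u, 3))     # each row is one user's interest over the labels
print(predict(R, sim, 1, 2))  # predicted rating of user 1 for item 2
```

In this sketch each user is compressed from a sparse row of I ratings to a dense vector of K label probabilities, so two users can be compared even when they have rated no item in common. The dense (U, I, K) posterior array is only practical for toy data; a realistic version would iterate over the observed ratings instead.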