Traditional collaborative filtering algorithms perform poorly under cold-start conditions. To address this problem, an item-based collaborative filtering algorithm, ICF_IPSS, was proposed. It is built on IPSS (Inverse Item Frequency-based Proximity-Significance-Singularity), a novel item similarity measure composed of a rating similarity and a structure similarity. The rating similarity takes into account the difference between the ratings of two items, the difference between each item rating and the median rating, and the difference between each item rating and the average of the other ratings. The structure similarity defines an Inverse Item Frequency (IIF) coefficient that reflects the proportion of co-rated items among all items and penalizes active users. Experiments were conducted on the MovieLens and Jester data sets to test the accuracy of ICF_IPSS. On the MovieLens data set, with 10 nearest neighbors, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) of ICF_IPSS were 3.06% and 1.20% lower, respectively, than those of ICF_JMSD (item collaborative filtering based on the Jaccard-based Mean Square Difference). With 10 recommended items, its precision and recall were 67.79% and 67.86% higher, respectively, than those of ICF_JMSD. The experimental results show that ICF_IPSS outperforms item collaborative filtering algorithms based on traditional similarity measures, such as ICF_JMSD, in both prediction accuracy and classification accuracy.
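The four accuracy figures reported above (MAE, RMSE, precision, and recall) have standard definitions, which the following minimal Python sketch illustrates. The function names and the toy data are hypothetical and not taken from the paper. The abstract does not state the exact evaluation protocol (for example, how relevant items are chosen), so this shows only the textbook form of each metric.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def precision_recall_at_n(recommended, relevant, n):
    """Precision and recall of the top-n recommended items.

    recommended: ranked list of item ids (best first)
    relevant:    set of item ids the user actually liked (assumed given)
    """
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, purely illustrative.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]
print(mae(actual, predicted), rmse(actual, predicted))

ranked = ["a", "b", "c", "d"]
liked = {"a", "c", "e"}
print(precision_recall_at_n(ranked, liked, 4))
```

Lower MAE and RMSE indicate better rating prediction, while higher precision and recall indicate a better top-N recommendation list, which is why the abstract reports the first pair as "lower" and the second pair as "higher" than ICF_JMSD.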