Most traditional recommender systems are built on user ratings and trained offline in batch mode. This paper addresses two challenges: (1) building a recommendation model from implicit user feedback, which is more widespread and easier to collect than explicit ratings, and (2) making real-time recommendations over a stream of feedback to improve timeliness. To overcome the class imbalance problem that arises from the nature of implicit feedback, we directly model the observed user adoptions in a probabilistic framework, so that no negative samples need to be introduced during training. To increase training efficiency and capture drifts in user taste promptly, we propose an online learning algorithm for the model. The online algorithm reinforces learning of users' new tendencies while weakening learning of habitual behavior and noise. The key idea is to dynamically adjust the learning step size for each feedback event by comparing the probability of that feedback occurring with the user's confidence. Finally, we design an online evaluation mechanism and conduct comprehensive experiments on two real-world datasets. The experimental results validate the effectiveness of the proposed methods and show their advantages in recommendation accuracy, diversity, interpretability, training efficiency, and robustness. Moreover, the online model naturally addresses the cold-start problem.
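The abstract does not state the update rule, so the following is only a minimal sketch of what a per-feedback adaptive step size could look like, not the paper's algorithm. Everything in it is an assumption made for illustration: the softmax form of the adoption probability, the class name `OnlineAdoptionModel`, the definition of "user confidence" as a running average of the probability assigned to that user's past feedback, and the ratio-based step rule. The sketch models only observed adoptions (no sampled negatives) and does not attempt to model the paper's handling of noise.

```python
import numpy as np


class OnlineAdoptionModel:
    """Toy softmax model of user adoptions, updated one feedback at a time."""

    def __init__(self, n_users, n_items, dim=8, base_lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.U = rng.normal(scale=0.1, size=(n_users, dim))  # user factors
        self.V = rng.normal(scale=0.1, size=(n_items, dim))  # item factors
        self.base_lr = base_lr
        # ASSUMPTION: "user confidence" is a running average of the
        # probability the model gave to the user's past feedback.
        self.confidence = np.full(n_users, 1.0 / n_items)

    def adoption_probs(self, u):
        """Probability that user u adopts each item (softmax over items)."""
        scores = self.V @ self.U[u]
        scores = scores - scores.max()  # numerical stability
        e = np.exp(scores)
        return e / e.sum()

    def step_size(self, u, p_feedback):
        """ASSUMED rule: feedback that is less probable than the user's
        confidence (a possible new trend) gets a larger step; feedback
        that is more probable (habitual) gets a smaller one."""
        ratio = self.confidence[u] / (p_feedback + 1e-12)
        return self.base_lr * float(np.clip(ratio, 0.1, 10.0))

    def update(self, u, i):
        """One gradient-ascent step on log p(i | u) for an observed adoption."""
        p = self.adoption_probs(u)
        lr = self.step_size(u, p[i])
        err = -p
        err[i] += 1.0                       # (one-hot of i) - p
        grad_u = err @ self.V               # d log p / d U[u]
        grad_V = np.outer(err, self.U[u])   # d log p / d V
        self.U[u] += lr * grad_u
        self.V += lr * grad_V
        # Update the running confidence with the pre-update probability.
        self.confidence[u] = 0.9 * self.confidence[u] + 0.1 * p[i]
        return lr


if __name__ == "__main__":
    model = OnlineAdoptionModel(n_users=2, n_items=5)
    stream = [(0, 3)] * 50 + [(0, 1)]  # habitual item 3, then a new item 1
    for user, item in stream:
        lr = model.update(user, item)
    print("step size used for the final (new) feedback:", lr)
```

In this sketch the step size shrinks as a feedback becomes habitual for the user and grows when the user does something the model considers unlikely, which mirrors the qualitative behavior the abstract describes.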