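Personalized recommendation methods such as collaborative filtering and content-based filtering suffer from problems including the collection of private user data and cold start. To address them, this paper proposes a method for mining group interests and their correlations, and applies it to recommendation. Using Wikipedia as the data source, we obtain user communities and the entries they edit, and design a growth strategy for a general tree structure built on entries and the categories they belong to; this general tree represents the interest points of a user community. Combining the structural features of user communities with the semantic features of interest points, we define a community's degree of attention to an interest point and the correlation between interest points. With this group interest replacing the individual interest used in personalized recommendation methods, we carry out three experiments: manual intuitive evaluation, test-set comparison, and news recommendation in a video-on-demand service. The results show that the accuracy of group-interest correlation on the test set reaches 50%, higher than that of the baseline collaborative recommendation method. In the news recommendation experiment, the method achieves a click-through rate nearly double that of popularity-based recommendation, which supports the validity of group interests and their correlations.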
Personalized recommendation technologies, such as collaborative filtering and content-based filtering, face several problems. The most obvious are the need to collect private user history data and the cold-start problem. In this paper, we propose a method for mining group interests from Wikipedia. We also apply group interests to a recommendation system, which avoids cold start and requires no private data. Here, the group interest replaces the individual interest used in traditional personalized recommendation technologies. In detail, we first propose a general tree structure and a growth strategy to represent the interests of a user group, including the semantic relationships among those interests. Then we define the group interest based on the structure of user groups. Finally, we measure the correlations between interests according to the general tree structure of interests. We further design three types of experiments to evaluate the reasonableness of group interests: manual evaluation, test-set evaluation, and a news recommendation experiment in a video service. The results show that the accuracy of the correlation between group interests can exceed 50%, and that the news hit rate of recommendations based on group interests is twice that of recommendations based on news popularity.
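As an illustration only, the idea of a general tree grown from entries and their categories can be sketched as follows. The abstract does not give the actual growth strategy or the formulas for attention and correlation, so everything here is an assumption: the function names `grow_tree`, `correlation`, and `attention` are hypothetical, correlation is taken to be the inverse of the tree distance between two interest points, and attention is taken to be a group's share of edits on an entry. Each node is given a single parent, and the category chains are assumed to be mutually consistent (no cycles).

```python
def grow_tree(entry_categories):
    """Grow a tree from entries and their category chains.

    entry_categories maps an entry to its categories, ordered from the
    most general to the most specific (an assumed input format).
    Returns a child -> parent map; the first parent seen for a node is kept.
    """
    parent = {}
    for entry, chain in entry_categories.items():
        path = ["ROOT", *chain, entry]
        for up, down in zip(path, path[1:]):
            parent.setdefault(down, up)
    return parent


def path_to_root(node, parent):
    """List the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def correlation(a, b, parent):
    """Assumed measure: 1 / (1 + number of edges between a and b)."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b, parent))}
    for steps_from_a, n in enumerate(path_to_root(a, parent)):
        if n in steps_from_b:  # lowest common ancestor
            return 1.0 / (1 + steps_from_a + steps_from_b[n])
    return 0.0  # no shared ancestor


def attention(group_edits):
    """Assumed measure: fraction of the group's edits made on each entry."""
    total = sum(group_edits.values())
    return {entry: count / total for entry, count in group_edits.items()}


# Toy data, invented for the example.
entries = {
    "Python": ["Computing", "Programming languages"],
    "Haskell": ["Computing", "Programming languages"],
    "Jazz": ["Arts", "Music"],
}
tree = grow_tree(entries)
print(correlation("Python", "Haskell", tree))  # siblings: closer
print(correlation("Python", "Jazz", tree))     # different branches: farther
print(attention({"Python": 3, "Haskell": 1}))
```

Under these assumptions, entries that share a specific category score higher than entries that meet only at the root, which is the kind of semantic relationship the tree is meant to capture. The paper's own definitions also use the structure of the user groups, which this sketch does not model.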