Existing automatic document recommendation systems on the Internet are generally poorly targeted and inefficient, because they lack adequate knowledge of users' interests and incur a heavy online computational load. To address these problems, this paper proposes a personalized automatic document recommendation system based on clustering and classification. The system uses machine learning methods to learn user profiles implicitly and provides a personalized document recommendation service to each registered user according to that user's profile. To improve recommendation quality and reduce online computation, the system consists of two parts: an offline subsystem that generates user profiles and user groups, and an online subsystem that performs personalized document recommendation. The offline subsystem clusters documents to form interest clusters, builds user profiles based on these interest clusters, and clusters users by interest to form a user group for each interest cluster. The online subsystem classifies each new document to find the interest cluster it belongs to and proactively recommends the document to that cluster's user group. Theoretical analysis and experimental results show that the system significantly improves recommendation effectiveness and online response speed. The proposed design and techniques are also applicable to other personalized information recommendation systems on the Internet.
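The offline/online split described above can be illustrated with a minimal sketch. The abstract does not name the concrete algorithms, so everything below is an assumption made for illustration: bag-of-words vectors, a k-means-style clustering with cosine similarity, nearest-centroid classification for new documents, and a simple threshold on each user's reading history in place of a real clustering of users. All function names, the toy documents, and the user names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (an assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sum_vectors(vectors):
    """Sum of vectors; cosine ignores scale, so the sum can serve as a centroid."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def nearest(vec, centroids):
    """Index of the most similar centroid (nearest-centroid classification)."""
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))


def cluster_documents(vectors, seeds, iterations=10):
    """Offline step 1: k-means-style clustering into interest clusters.

    `seeds` are indices of the documents used as initial centroids.
    """
    centroids = [vectors[i] for i in seeds]
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = sum_vectors(members)
    labels = [nearest(v, centroids) for v in vectors]
    return centroids, labels


def build_user_groups(histories, labels, k, threshold=0.5):
    """Offline steps 2 and 3: profile each user as the share of read documents
    per interest cluster, then put the user in every cluster whose share
    reaches `threshold` (a simplification of clustering users)."""
    groups = defaultdict(set)
    for user, doc_ids in histories.items():
        counts = Counter(labels[d] for d in doc_ids)
        for c in range(k):
            if doc_ids and counts[c] / len(doc_ids) >= threshold:
                groups[c].add(user)
    return dict(groups)


def recommend(new_text, centroids, groups):
    """Online step: classify the new document, return its cluster and user group."""
    c = nearest(vectorize(new_text), centroids)
    return c, sorted(groups.get(c, ()))


# Toy data (hypothetical): four documents and three users' reading histories.
docs = [
    "football match goal team",
    "team wins football league",
    "stock market shares price",
    "market price falls shares bank",
]
histories = {"alice": [0, 1], "bob": [2, 3], "carol": [0, 1, 2]}

vectors = [vectorize(d) for d in docs]
centroids, labels = cluster_documents(vectors, seeds=[0, 2])
groups = build_user_groups(histories, labels, k=len(centroids))
print(recommend("football team signs new goal keeper", centroids, groups))
```

In this sketch the expensive work (clustering documents and grouping users) happens once offline, while the online step costs only one similarity comparison per interest cluster for each new document, which is the source of the reduced online computation that the abstract claims.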