Features of different granularity in a social tagging system differ in their power to describe a document. Exploiting this property, this paper proposes recommending social tags at two levels of granularity, the fine word level and the coarse topic level, to improve recommendation accuracy. A statistical language model (word level) and a latent topic model, latent Dirichlet allocation (topic level), are each used to model the description set and the tag set of documents. Tags are first recommended with each model on its own and then with hybrid methods that combine the two granularities, and the performance of the methods is compared. The experimental results show that, among the single methods, recommendation based on the statistical language model outperforms recommendation based on the topic model; the hybrid of the two methods outperforms the non-hybrid topic-based method; and hybrid methods that involve fewer features outperform those that involve more.
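The idea of combining word-level and topic-level evidence can be illustrated with a minimal sketch. The abstract does not state how the two models are combined, so the linear interpolation below (weight `lam`) is an assumption, not the paper's method. The function names, the add-alpha smoothing, and the toy data are hypothetical, and the topic distributions are assumed to come from an already trained topic model.

```python
import math
from collections import Counter

def word_scores(doc_words, tag_texts, alpha=1.0):
    """Word level: P(tag | doc) from one add-alpha smoothed unigram model per tag.

    tag_texts maps each tag to the words of the descriptions labelled with it
    (a hypothetical training format). A uniform prior over tags is assumed.
    """
    vocab = {w for words in tag_texts.values() for w in words} | set(doc_words)
    log_lik = {}
    for tag, words in tag_texts.items():
        counts = Counter(words)
        total = len(words) + alpha * len(vocab)
        log_lik[tag] = sum(math.log((counts[w] + alpha) / total) for w in doc_words)
    # Normalise in log space to avoid underflow on long documents.
    top = max(log_lik.values())
    unnorm = {t: math.exp(v - top) for t, v in log_lik.items()}
    z = sum(unnorm.values())
    return {t: v / z for t, v in unnorm.items()}

def topic_scores(doc_topics, tag_given_topic):
    """Topic level: P(tag | doc) = sum over topics z of P(tag | z) * P(z | doc).

    doc_topics[z] is P(z | doc); tag_given_topic[z] maps tags to P(tag | z).
    Both are assumed to be inferred beforehand by a topic model such as LDA.
    """
    tags = {t for dist in tag_given_topic for t in dist}
    return {
        t: sum(p_z * dist.get(t, 0.0) for p_z, dist in zip(doc_topics, tag_given_topic))
        for t in tags
    }

def hybrid_scores(word, topic, lam=0.5):
    """Assumed hybrid: linear interpolation of the two score dictionaries."""
    tags = set(word) | set(topic)
    return {t: lam * word.get(t, 0.0) + (1 - lam) * topic.get(t, 0.0) for t in tags}

def recommend(scores, k=2):
    """Return the k highest-scoring tags."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With `lam=1.0` the hybrid reduces to the word-level model and with `lam=0.0` to the topic-level model, so the single methods and the hybrid can be compared by varying one parameter.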