Existing user interest clustering methods do not account for the semantic relatedness between user labels. To address this, we propose a feature-mapping-based method for clustering microblog users' interests from their labels. First, the labels of the target users and of the users they follow are collected, and the labels whose frequency exceeds a set threshold are selected as the feature dimensions of a fuzzy matrix. Second, to take semantic relatedness into account, each user label is mapped onto every feature dimension according to its semantic similarity to that dimension's label, and a feature value is computed for each dimension. Finally, fuzzy clustering is applied to obtain user interest clusters under different thresholds. Experimental results show that the proposed method effectively improves user interest clustering.
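The three steps summarized above can be illustrated with a minimal Python sketch. The abstract does not specify the semantic similarity measure, how similarities are aggregated into a feature value, or which fuzzy clustering procedure is used, so everything below is an assumption made for illustration: the function names are hypothetical, `toy_similarity` is a character-overlap stand-in for a real semantic similarity, the feature value is taken as the maximum similarity, and clustering uses the max-min transitive closure of a fuzzy similarity relation followed by a λ-cut.

```python
from collections import Counter


def build_feature_dims(user_labels, min_count):
    """Keep labels whose frequency across all users exceeds min_count."""
    counts = Counter(l for labels in user_labels.values() for l in labels)
    return sorted(l for l, c in counts.items() if c > min_count)


def toy_similarity(a, b):
    """Stand-in for semantic similarity (assumption): character Jaccard."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def map_user(labels, feature_dims, similarity):
    """Feature value per dimension = best similarity of any user label
    to that dimension's label (max aggregation is an assumption)."""
    return [max((similarity(l, d) for l in labels), default=0.0)
            for d in feature_dims]


def fuzzy_relation(vectors):
    """Fuzzy similarity matrix between users (min/max ratio, assumed)."""
    n = len(vectors)
    rel = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            num = sum(min(a, b) for a, b in zip(vectors[i], vectors[j]))
            den = sum(max(a, b) for a, b in zip(vectors[i], vectors[j]))
            rel[i][j] = rel[j][i] = num / den if den else 0.0
    return rel


def transitive_closure(rel):
    """Repeat max-min composition until the matrix stops changing."""
    n = len(rel)
    while True:
        nxt = [[max(min(rel[i][k], rel[k][j]) for k in range(n))
                for j in range(n)] for i in range(n)]
        if nxt == rel:
            return rel
        rel = nxt


def lambda_cut(closure, names, lam):
    """Group users whose closure value is at least the threshold lam."""
    clusters, assigned = [], set()
    for i in range(len(names)):
        if i in assigned:
            continue
        group = [j for j in range(len(names)) if closure[i][j] >= lam]
        assigned.update(group)
        clusters.append([names[j] for j in group])
    return clusters


def cluster_users(user_labels, min_count, lam, similarity=toy_similarity):
    """Run the whole pipeline for one clustering threshold lam."""
    names = sorted(user_labels)
    dims = build_feature_dims(user_labels, min_count)
    vectors = [map_user(user_labels[u], dims, similarity) for u in names]
    closure = transitive_closure(fuzzy_relation(vectors))
    return lambda_cut(closure, names, lam)
```

Calling `cluster_users` with several values of `lam` yields coarser or finer groupings from the same closure matrix, which is one way to read "clustering results under different thresholds". In practice `toy_similarity` would be replaced by a genuine semantic similarity between labels.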