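We propose a method for constructing micro-blog user interest models based on term correlation and normalized-cut weighted non-negative matrix factorization. The method first builds a term correlation matrix from the contextual semantic relatedness of term distributions, which characterizes the similarity between terms. It then applies a normalized-cut weighted non-negative matrix factorization algorithm to obtain the user-topic matrix, producing clusters of the micro-blog topics that users are interested in. Experiments show that the method clusters micro-blog topics effectively and supports the construction of micro-blog user interest models.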
This paper proposes a non-negative matrix factorization approach, based on term correlation and normalized cut weighting, for building micro-blog user interest models. First, it constructs a term correlation matrix from the term distribution context to better capture the similarities between terms. It then presents an Ncut-weighted non-negative matrix factorization (NCUT_WEIGHTED NMF) method to obtain the user-topic matrix, which gives the clustering of topics of interest to each user. Experiments show that this method can effectively cluster micro-blog topics and thereby support the construction of micro-blog user interest models.
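The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the term correlation matrix is taken to be the cosine similarity between term rows of a term-by-user count matrix, the correlation is applied by smoothing the counts as `S @ X`, the Ncut weights are the row sums of `X.T @ X`, and the factorization uses standard multiplicative updates. The function names `term_correlation` and `ncut_weighted_nmf` are hypothetical.

```python
import numpy as np


def term_correlation(X):
    """Cosine similarity between the term rows of X (terms x users).

    Assumption: cosine similarity stands in for the paper's
    context-based term correlation measure.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # unit-length rows; zero rows stay zero
    return Xn @ Xn.T


def ncut_weighted_nmf(X, k, n_iter=200, seed=0):
    """Approximate the Ncut-weighted X by U @ V.T with non-negative U, V.

    X is terms x users; rows of V give each user's topic weights.
    Assumption: weights d_j are the row sums of X.T @ X, and column j
    of X is scaled by d_j ** -0.5 before a plain NMF is run.
    """
    eps = 1e-12
    d = (X.T @ X).sum(axis=1)               # one weight per user (column)
    Xw = X / np.sqrt(np.maximum(d, eps))    # broadcasts over columns
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], k)) + eps   # term-topic factor
    V = rng.random((X.shape[1], k)) + eps   # user-topic factor
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective
        U *= (Xw @ V) / (U @ (V.T @ V) + eps)
        V *= (Xw.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


# Toy term-by-user counts: users 0-1 use terms 0-2, users 2-3 use terms 3-5.
X = np.array([
    [3.0, 3.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 2.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 1.0, 2.0],
])
S = term_correlation(X)             # term-term similarity
U, V = ncut_weighted_nmf(S @ X, 2)  # smooth counts with S, then factorize
labels = V.argmax(axis=1)           # dominant topic per user
```

Taking the largest entry in each row of `V` assigns every user to one topic cluster; a fuller interest model would presumably keep the whole row as a weighted topic profile.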