Most current community division methods are based on network structure, but division based on structure alone cannot uncover the latent relationships among community members and therefore cannot reveal how communities tend to evolve. To address this, a structure-based community division algorithm, CDS (Community Division based on Structure), is proposed, which uses node degree and node Euclidean distance to divide a social network structurally. In addition, the classical K-means algorithm, when applied to community division, suffers from poor clustering results caused by random selection of the initial centers and an unsuitable choice of k. To remedy these problems, NPCluster (Non Presetting Cluster), a K-means variant that builds on the community structure and does not require k to be set manually, is proposed. Working on the community structure produced by CDS, NPCluster selects the nodes with the largest degree, in turn, as cluster centers, merges clusters using "less than the average feature Euclidean distance" as the criterion, and iterates until clustering is complete. Theoretical analysis and comparative experiments show that CDS can effectively divide the community structure; that, compared with K-means, NPCluster achieves higher clustering accuracy and better time efficiency on the divided community structure; and that a community division method combining structure and attributes is effective and feasible.
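The abstract gives only an outline of NPCluster, so the following is a minimal sketch of one possible reading of that outline, not the authors' implementation. It assumes that a single community already produced by CDS is given as an adjacency mapping, that every node has a numeric attribute vector, and that "merging" means attaching to the current center every unassigned node whose feature distance to it is below the average pairwise feature distance. The names `npcluster_sketch`, `adjacency`, and `features` are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def npcluster_sketch(adjacency, features):
    """Cluster the nodes of one community without presetting k.

    adjacency: dict mapping node -> set of neighbouring nodes (assumed input)
    features:  dict mapping node -> tuple of numeric attributes (assumed input)
    Returns a dict mapping each chosen center to the list of its members.
    """
    nodes = list(adjacency)
    if len(nodes) < 2:
        return {n: [n] for n in nodes}

    # Merge criterion (assumption): average feature distance over all node pairs.
    pair_dists = [euclidean(features[u], features[v])
                  for u, v in combinations(nodes, 2)]
    avg_dist = sum(pair_dists) / len(pair_dists)

    # Candidate centers are visited in order of decreasing degree.
    unassigned = sorted(nodes, key=lambda n: len(adjacency[n]), reverse=True)

    clusters = {}
    while unassigned:
        center = unassigned.pop(0)  # highest-degree node still unassigned
        members, remaining = [center], []
        for n in unassigned:
            if euclidean(features[center], features[n]) < avg_dist:
                members.append(n)   # close enough: merge into this cluster
            else:
                remaining.append(n)
        clusters[center] = members
        unassigned = remaining      # repeat until every node is assigned
    return clusters
```

In this reading the number of clusters is an outcome of the loop rather than an input, which is the property the abstract attributes to NPCluster. The merge rule, the tie-breaking among nodes of equal degree, and the absence of a center-update step are all simplifications chosen for the sketch.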