To show that uncertainty has a non-negligible effect on clustering results, this paper improves LinLogLayout, an algorithm that optimizes LinLog and related energy models to compute layouts and Newman and Girvan's modularity to compute clusterings, so that it can handle uncertain graph data. The paper gives an explicit definition of an uncertain graph, generates uncertain graph data that follows a Zipf distribution, and adapts the deterministic algorithm to uncertain input to meet application requirements. Experiments on certain and uncertain graphs, using both synthetic and real datasets, show that the improved LinLogLayout algorithm clusters well in every case. The results also indicate that uncertainty has a non-negligible effect on clustering results.
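As an illustration of the setting summarized above, the following minimal sketch builds an uncertain graph and scores a clustering of it. It rests on assumptions not stated in the abstract: an uncertain graph is modeled as a set of edges, each with an existence probability; "Zipf distribution" is read as assigning the edge of rank k a probability proportional to 1 / k^s; and modularity is computed by using those probabilities as edge weights. The names `zipf_probabilities`, `make_uncertain_graph`, and `weighted_modularity` are hypothetical and do not come from LinLogLayout.

```python
import random


def zipf_probabilities(n, s=1.0):
    """Return n values in (0, 1]; the k-th value is 1 / k**s (Zipf-shaped by rank)."""
    return [1.0 / (k ** s) for k in range(1, n + 1)]


def make_uncertain_graph(edges, s=1.0, seed=0):
    """Attach an existence probability to every edge (assumed model of an uncertain graph)."""
    rng = random.Random(seed)
    probs = zipf_probabilities(len(edges), s)
    rng.shuffle(probs)  # assign the ranks to edges at random
    return {edge: p for edge, p in zip(edges, probs)}


def weighted_modularity(uncertain_graph, community):
    """Newman-Girvan modularity with edge probabilities used as edge weights."""
    total = sum(uncertain_graph.values())  # total edge weight m
    inside = {}    # weight of edges with both ends in the same community
    strength = {}  # summed weighted degree of the nodes in each community
    for (u, v), p in uncertain_graph.items():
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + p
        strength[cv] = strength.get(cv, 0.0) + p
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + p
    # Q = sum over communities of (inside / m) - (strength / 2m)^2
    return sum(
        inside.get(c, 0.0) / total - (strength[c] / (2.0 * total)) ** 2
        for c in strength
    )


# Two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

certain = {edge: 1.0 for edge in edges}     # every edge surely exists
uncertain = make_uncertain_graph(edges)     # Zipf-shaped probabilities

print(weighted_modularity(certain, community))
print(weighted_modularity(uncertain, community))
```

Comparing the two printed scores for the same partition gives a small-scale view of how edge uncertainty can change a clustering quality measure; it is not a reproduction of the paper's experiments.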