To improve the running efficiency of clustering on large data sets that do not require high accuracy, various clustering algorithms are compared and a part priority clustering algorithm is proposed. Its relative advantages are described and a performance comparison table is given. The ways in which clustering members are generated and in which clusterings are fused are then analyzed in order to design a consensus function. Building on the part priority clustering algorithm, class centers are determined by a weighted method before the clustering ensemble is performed, which improves the accuracy of the algorithm. Experimental results show that the resulting ensemble algorithm has clear advantages in scalability, stability, and robustness.