Grid-based clustering algorithms are efficient and can handle high-dimensional data, but the clustering quality of traditional grid-based algorithms depends heavily on the granularity of the grid partition. To address this, this paper proposes IGrid, an incremental grid-based clustering algorithm. IGrid retains the efficiency of traditional grid-based clustering and improves clustering quality by partitioning the grid space dynamically and incrementally according to a dimensional radius. Experiments on real and synthetic datasets show that IGrid outperforms traditional grid-based clustering algorithms in both accuracy and efficiency.
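The granularity problem that motivates IGrid can be illustrated with a minimal sketch of a conventional fixed-grid clustering scheme. This is not the IGrid algorithm, whose dimensional-radius partitioning is not specified in this abstract. The function name `grid_cluster`, the parameters `cell_width` and `min_points`, and the sample points are all hypothetical, chosen only for illustration. The sketch assigns each point to a fixed-width cell, keeps cells holding at least `min_points` points, and merges neighbouring dense cells into clusters.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_width, min_points=2):
    """Fixed-grid clustering sketch (illustrative only, not IGrid).

    Dense cells that touch, including diagonally, are merged into one cluster.
    """
    # Map every point to the integer coordinates of its grid cell.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(x // cell_width) for x in p)
        cells[key].append(p)

    # A cell is "dense" when it holds at least min_points points.
    dense = {k for k, pts in cells.items() if len(pts) >= min_points}
    dims = len(next(iter(dense))) if dense else 0
    # All neighbour offsets except the zero offset (the cell itself).
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]

    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        # Breadth-first search over adjacent dense cells.
        while queue:
            cell = queue.popleft()
            members.extend(cells[cell])
            for o in offsets:
                nb = tuple(c + d for c, d in zip(cell, o))
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(members)
    return clusters


# Two well-separated groups of four points each (hypothetical data).
points = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4),
    (2.1, 2.2), (2.3, 2.1), (2.2, 2.4), (2.4, 2.3),
]

for width in (0.05, 1.0, 4.0):
    print(width, len(grid_cluster(points, width)))
```

With these sample points, a cell width of 1.0 recovers the two groups, a width of 4.0 merges everything into one cluster, and a width of 0.05 leaves every point alone in its cell, so no cell is dense and no cluster is found. The data are identical in all three runs; only the partition granularity changes.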