The quality of K-Means clustering results depends on the choice of initial cluster centers. This paper introduces the idea of local search into K-Means and proposes an improved algorithm, KMLS. After K-Means converges, KMLS applies local search to move the solution out of its local optimum and then iterates again in search of a better one; K-Means is in turn applied to each local-search result so that it reaches a new local optimum quickly. Theoretical analysis demonstrates the feasibility and effectiveness of the algorithm, and text clustering experiments on standard document collections show that KMLS improves the quality of the clustering results compared with the traditional K-Means algorithm.
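
The alternation described above, in which K-Means converges to a local optimum, a local-search move perturbs the solution, and K-Means converges again, can be illustrated with a minimal sketch. The abstract does not specify the local-search neighborhood, the objective, or the acceptance rule, so the following are assumptions made only for illustration: the move replaces one center with a randomly chosen data point, quality is measured by the sum of squared errors (SSE), and a candidate is kept only if it lowers the SSE. The names `kmeans`, `kmls`, `rounds`, and `seed` are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Run Lloyd iterations from the given centers.

    Returns (centers, labels, sse) once assignments stop changing
    or max_iter is reached.
    """
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: a local optimum
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


def kmls(points, k, rounds=50, seed=0):
    """Sketch of K-Means with local search (assumed move and acceptance rule).

    Returns (centers, labels, sse, baseline_sse), where baseline_sse is the
    SSE of plain K-Means from the same random initialization.
    """
    rng = random.Random(seed)
    best = kmeans(points, rng.sample(points, k))
    baseline_sse = best[2]
    for _ in range(rounds):
        # Assumed local-search move: swap one center for a random data point.
        trial = list(best[0])
        trial[rng.randrange(k)] = rng.choice(points)
        # Let K-Means carry the perturbed solution to a nearby local optimum.
        candidate = kmeans(points, trial)
        if candidate[2] < best[2]:  # accept only strict improvements
            best = candidate
    return best[0], best[1], best[2], baseline_sse


if __name__ == "__main__":
    gen = random.Random(1)
    blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    data = [(cx + gen.gauss(0, 1), cy + gen.gauss(0, 1))
            for cx, cy in blobs for _ in range(30)]
    centers, labels, sse, baseline = kmls(data, k=3)
    print("plain K-Means SSE:", round(baseline, 3))
    print("with local search:", round(sse, 3))
```

Because a candidate is accepted only when it lowers the SSE, this sketch can never return a result worse than plain K-Means from the same initialization; whether the paper's acceptance rule behaves the same way is not stated in the abstract.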