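To address the large data volumes and partially overlapping clusters that arise in image clustering, a sliding-window-based multi-label propagation clustering algorithm is proposed. First, the similarity between images is computed from the image distance, and a threshold is applied to convert similarities into links, yielding an undirected graph. Then, a sliding-window-based multi-label propagation algorithm partitions the undirected graph into communities. Because a sliding window can hold multiple labels, an image can belong to more than one class. Experiments on public network data and on real image data returned by search engines show that the method effectively finds clusters with overlapping partitions and that the resulting clusters are comparatively meaningful.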
To address the problems of large-scale data and partial overlap in image clustering, a novel sliding-window-based multiple-label propagation clustering algorithm is proposed. First, an undirected graph is constructed in which each vertex represents an image and each edge represents the relation between two images, weighted by the similarity computed from the image distance. Then, community detection is performed on this graph by sliding-window-based multiple-label propagation. Because a sliding window can store multiple labels, each image may receive one or more labels. Experiments on public networks and on images returned by search engines show that the method can find well-defined clusters with partial overlap.
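
The two stages described above can be illustrated with a minimal sketch. The abstract does not give the similarity formula, the window update rule, or any parameter values, so everything specific here is an assumption made for illustration: the similarity `1 / (1 + d)`, the hypothetical function names `build_graph` and `propagate`, the parameters `window`, `rounds`, and `keep`, and the speaker-listener style update in which each neighbour sends one label drawn from its own window. It should not be read as the authors' implementation.

```python
from collections import Counter, deque
import random


def build_graph(dist, threshold):
    """Link images i and j when their similarity reaches the threshold.

    Assumption: similarity = 1 / (1 + distance); the abstract does not
    state the actual formula.
    """
    n = len(dist)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if 1.0 / (1.0 + dist[i][j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def propagate(adj, window=5, rounds=30, keep=0.3, seed=0):
    """Assumed multi-label propagation with a fixed-size sliding window.

    Each vertex keeps the last `window` labels it accepted. Labels whose
    share of the final window is at least `keep` become its memberships,
    so one vertex can end up in several clusters.
    """
    rng = random.Random(seed)
    # Every vertex starts with its own id as its only label.
    win = {v: deque([v], maxlen=window) for v in adj}
    nodes = sorted(adj)
    for _ in range(rounds):
        rng.shuffle(nodes)
        for v in nodes:
            if not adj[v]:
                continue  # isolated vertex keeps its own label
            # Each neighbour sends one label sampled from its window.
            received = Counter(
                rng.choice(list(win[u])) for u in sorted(adj[v])
            )
            top = max(received.values())
            best = sorted(l for l, c in received.items() if c == top)
            # Appending to a full deque drops the oldest label.
            win[v].append(rng.choice(best))
    result = {}
    for v in adj:
        counts = Counter(win[v])
        chosen = {l for l, c in counts.items() if c / len(win[v]) >= keep}
        # Fall back to the most frequent label if none passes `keep`.
        result[v] = chosen or {counts.most_common(1)[0][0]}
    return result
```

Calling `propagate(build_graph(dist, threshold))` on a symmetric distance matrix returns a dictionary that maps each image index to a set of cluster labels. A larger `window` or a smaller `keep` allows more labels per image, and therefore more overlap between clusters.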