A Greedy-Tree-Based Support Vector Machine External Clustering Algorithm for Near-Duplicate Images

ISSN: 1003-0530
Journal: 《信号处理》 (Signal Processing)
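Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Institute of Information Engineering, PLA Information Engineering University, Zhengzhou, Henan 450002; [2] Unit 73615, Nanjing, Jiangsu 210049
Funding: National Natural Science Foundation of China (60872142)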

Keywords: clustering, greedy tree, support vector machine, probabilistic latent semantic analysis

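Chinese abstract (translated):

Accurately detecting near-duplicate images is important for redundancy removal and copyright infringement detection. To improve the performance of the uniform-splitting-based support vector machine external clustering algorithm, a near-duplicate image clustering algorithm that combines a greedy tree with support vector machine external clustering is proposed. The method first uses support vector machine external clustering to partition the dataset into two clusters, then uses a greedy tree growing algorithm to select the "best" cluster to split, and repeats this process until no cluster can be split further. In addition, to overcome the synonymy problem of image visual words, a probabilistic latent semantic analysis model is used to map co-occurring visual words to the same direction in the latent semantic space. Experimental results show that, compared with the internal support vector clustering algorithm and the uniform-splitting-based support vector machine external clustering algorithm, the method clearly improves clustering performance.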
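The split-then-select loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the binary splitter `bisect` is a plain 2-means stand-in for SVMEC (whose details the abstract does not give), and the "best" leaf is assumed to be the one whose split most reduces the within-cluster squared error. The names `bisect`, `sse`, `greedy_tree_cluster`, and `min_gain` are all hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points):
    """Within-cluster sum of squared distances to the centroid."""
    if not points:
        return 0.0
    c = centroid(points)
    return sum(dist2(p, c) for p in points)


def bisect(points, iters=20):
    """Stand-in binary splitter (2-means). The paper uses SVMEC here."""
    if len(points) < 2:
        return list(points), []
    # Deterministic seeding: farthest point from the mean, then farthest from it.
    mean = centroid(points)
    c0 = max(points, key=lambda p: dist2(p, mean))
    c1 = max(points, key=lambda p: dist2(p, c0))
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if dist2(p, c0) <= dist2(p, c1)]
        right = [p for p in points if dist2(p, c0) > dist2(p, c1)]
        if not left or not right:
            break
        c0, c1 = centroid(left), centroid(right)
    return left, right


def greedy_tree_cluster(points, min_gain=1.0):
    """Grow a tree of clusters, always splitting the leaf with the largest gain.

    Assumed criterion: gain = SSE(leaf) - SSE(left) - SSE(right).
    Growth stops when no candidate split reaches min_gain.
    """
    leaves = [list(points)]
    while True:
        best = None  # (gain, leaf index, left child, right child)
        for i, leaf in enumerate(leaves):
            left, right = bisect(leaf)
            if not left or not right:
                continue  # this leaf cannot be divided
            gain = sse(leaf) - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, i, left, right)
        if best is None or best[0] < min_gain:
            return leaves
        _, i, left, right = best
        leaves[i:i + 1] = [left, right]  # replace the leaf by its two children
```

Unlike uniform splitting, which divides every leaf at each level, this loop re-evaluates all leaves after each split and only divides the most promising one, so compact clusters are left intact while loose ones keep being refined.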

English abstract:

Detecting near-duplicate images accurately is very important for redundancy removal and copyright infringement detection. To improve the performance of Uniform Splitting based Support Vector Machine External Clustering (US-SVMEC), a near-duplicate image clustering algorithm that combines Greedy Tree with SVMEC (GT-SVMEC) is proposed in this paper. First, SVMEC is applied to partition the dataset into two clusters. Then, a greedy tree growing algorithm is used to choose the "best" cluster to split. This procedure is repeated until no further improvement can be achieved. In addition, to overcome the problem of visual word synonymy, a Probabilistic Latent Semantic Analysis (PLSA) model is adopted to map co-occurring image visual words to the same direction in the latent semantic space. Experimental results show that, compared with SVM-Internal Clustering (SVMIC) and US-SVMEC, the proposed approach noticeably improves clustering performance.
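To make the PLSA step more concrete, here is a minimal sketch of the standard EM updates for PLSA on an image-by-visual-word count matrix. It is an illustration only: the abstract gives no details of the authors' setup, so the function name `plsa`, the random initialisation, the iteration count, and the toy counts are all assumptions. Each image ends up with a topic distribution P(z|d), which can serve as a low-dimensional representation in which visual words that tend to co-occur contribute to the same latent topic.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d][w] is how often visual word w occurs in image d.

    Returns (p_z_d, p_w_z): per-image topic distributions P(z|d) and
    per-topic visual-word distributions P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialisation (an assumption; other schemes are possible).
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Accumulators start at a tiny value so no probability becomes exactly 0.
        new_wz = [[1e-12] * n_words for _ in range(n_topics)]
        new_zd = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(w | z) * P(z | d).
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(post)
                for z in range(n_topics):
                    r = n * post[z] / total
                    new_wz[z][w] += r  # M-step accumulator for P(w | z)
                    new_zd[d][z] += r  # M-step accumulator for P(z | d)
        p_w_z = [normalize(row) for row in new_wz]
        p_z_d = [normalize(row) for row in new_zd]
    return p_z_d, p_w_z


# Toy data (hypothetical): words 0 and 1 co-occur, as do words 2 and 3.
toy_counts = [
    [4, 3, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 2],
    [0, 0, 2, 5],
]
topic_mix, word_dist = plsa(toy_counts, n_topics=2)
```

In a pipeline like the one the abstract describes, the rows of `topic_mix` rather than the raw visual-word histograms would presumably be passed to the clustering stage.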
