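Over the past few years, the Bag-of-words model, which represents image content as a histogram of the numbers of occurrences of particular "visual words", has shown strong performance in image content classification. However, because this model only counts how often particular "visual words" occur, the relative positions between the "visual words" are almost entirely discarded. Starting from an analysis of the correspondence between the Bag-of-words model in text classification and in image content classification, this paper proposes an image representation that incorporates the relative positions between "visual words": the Bag-of-phrases model. The effect of this representation on image content classification performance is validated on standard datasets. The experimental results show that the proposed method achieves better classification performance than the traditional Bag-of-words model.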
The Bag-of-words representation, in which an image is represented as a histogram of the numbers of occurrences of particular visual words, has demonstrated impressive levels of performance in the past few years. However, the relative position information between the visual words is almost entirely ignored. In this paper, the potential strength of this relative position information is investigated and a new kind of representation named Bag-of-phrases is proposed. The effectiveness of this strategy is validated on two benchmark databases. The classification results demonstrate that our Bag-of-phrases strategy achieves better results than the Bag-of-words method.
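The loss of position information described above can be illustrated with a small sketch. It is not the paper's implementation: the toy data, the function names `bow_histogram` and `pair_histogram`, and the idea of treating a "phrase" as a pair of visual words whose keypoints lie within a fixed radius are all assumptions made only for illustration. The sketch shows that two images containing the same visual words in different arrangements receive identical Bag-of-words histograms, while counts over neighbouring word pairs can tell them apart.

```python
from collections import Counter
from itertools import combinations


def bow_histogram(words, vocab_size):
    """Count how often each visual-word index occurs; positions are not used."""
    counts = Counter(word for word, _pos in words)
    return [counts.get(i, 0) for i in range(vocab_size)]


def pair_histogram(words, radius):
    """Hypothetical 'phrase' counts (an illustrative assumption, not the
    paper's definition): unordered pairs of visual words whose keypoints
    lie within `radius` of each other."""
    counts = Counter()
    for (w1, (x1, y1)), (w2, (x2, y2)) in combinations(words, 2):
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= radius ** 2:
            counts[tuple(sorted((w1, w2)))] += 1
    return counts


# Two toy "images" given as (visual_word_index, (x, y)) entries:
# the same three visual words, arranged differently.
image_a = [(0, (0, 0)), (1, (1, 0)), (2, (10, 10))]
image_b = [(0, (0, 0)), (1, (10, 10)), (2, (1, 0))]

hist_a = bow_histogram(image_a, vocab_size=3)
hist_b = bow_histogram(image_b, vocab_size=3)
pairs_a = pair_histogram(image_a, radius=2)
pairs_b = pair_histogram(image_b, radius=2)

print(hist_a == hist_b)    # the word histograms cannot distinguish the images
print(pairs_a == pairs_b)  # the pair counts can
```

In this toy case, words 0 and 1 are neighbours in the first image, while words 0 and 2 are neighbours in the second, so only the pair counts differ. The radius and the restriction to pairs are arbitrary choices for this sketch.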