The conventional bag-of-words model represents an image only as a histogram of visual words, ignoring both the shape of the object and the spatial layout of the visual features. To address this, a pyramid scheme is introduced into the bag-of-words model to build a pyramid bag-of-words model, PHOW (Pyramid Histogram of Words), which is combined with PHOG (Pyramid Histogram of Oriented Gradients). The two descriptors carry complementary information and jointly represent the image, and an SVM is used as the classifier. Experiments on the Caltech 101 database validate the approach: the results show that the proposed method substantially improves image classification performance over the traditional method.
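To make the pyramid idea concrete, the following is a minimal sketch of how a histogram of visual words could be computed over a spatial pyramid. It is an illustration under stated assumptions, not the paper's implementation: the function name `pyramid_word_histogram`, its parameters, the 2^l x 2^l grid at level l, the plain L1 normalization, and the absence of any per-level weighting are all choices made here for the example. Keypoint coordinates and their visual-word indices are assumed to have been computed beforehand (for instance by quantizing local descriptors against a codebook).

```python
import numpy as np

def pyramid_word_histogram(xs, ys, words, width, height, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a spatial pyramid.

    Hypothetical helper for illustration: xs, ys are keypoint coordinates,
    words are the visual-word indices assigned to those keypoints.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid is cells x cells at this level
        # Cell index of each keypoint along x and y, clipped to the last cell.
        cx = np.minimum((xs / width * cells).astype(int), cells - 1)
        cy = np.minimum((ys / height * cells).astype(int), cells - 1)
        cell_id = cy * cells + cx
        # Joint index over (cell, word), counted in one pass.
        flat = cell_id * vocab_size + words
        parts.append(np.bincount(flat, minlength=cells * cells * vocab_size))
    feature = np.concatenate(parts).astype(float)
    total = feature.sum()
    # L1-normalize so images with different keypoint counts are comparable.
    return feature / total if total > 0 else feature
```

Level 0 reproduces the ordinary bag-of-words histogram, while finer levels record where in the image each visual word occurs. A shape descriptor such as PHOG could be built the same way from edge-orientation bins in place of visual words, and the two vectors passed to an SVM; how the paper actually fuses them is not specified in the abstract.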