The Bag-of-words representation, in which an image is represented as a histogram of the occurrence counts of particular visual words, has demonstrated impressive performance in the past few years. However, the relative position information between the visual words is almost entirely ignored. In this paper, the potential of this relative position information is investigated, and a new kind of representation named Bag-of-phrases is proposed. The effectiveness of this strategy is validated on two benchmark databases. The classification results demonstrate that the Bag-of-phrases strategy achieves better results than the Bag-of-words method.
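
As an illustration of the Bag-of-words baseline only, the following minimal sketch builds an occurrence histogram by assigning each local descriptor to its nearest visual word. The function name `bag_of_words_histogram`, the toy `codebook`, and the toy `descriptors` are assumptions made for this example and are not taken from the paper; nearest-neighbour assignment under Euclidean distance is likewise an assumed, common choice. The sketch does not attempt to implement Bag-of-phrases, whose construction is not specified here.

```python
import numpy as np

def bag_of_words_histogram(descriptors, codebook):
    """Count how often each visual word is the nearest codebook entry.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Assign each descriptor to its nearest visual word.
    words = dists.argmin(axis=1)
    # Histogram of occurrences; descriptor positions play no role here.
    return np.bincount(words, minlength=len(codebook))

# Toy data (assumed): three 2-D visual words and four local descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])

hist = bag_of_words_histogram(descriptors, codebook)
print(hist)  # [1 2 1]
```

Because the histogram depends only on which visual word each descriptor maps to, reordering the descriptors leaves it unchanged, which is the loss of relative position information noted above.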