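A key problem in semantic image retrieval is finding the association between low-level image features and semantics. Because text is an effective means of expressing semantics, this paper proposes studying the relationship between the text and image modalities in order to build an effective model of the latent semantic correlation between them. Based on this model, a retrieval intent can be expressed in natural language (text sentences) and used to retrieve relevant images. The model is based on sparse canonical correlation analysis (sparse CCA) and is trained in the following steps: first, a text semantic space is constructed using latent semantic analysis; next, the images corresponding to the texts are represented as bags of visual words; finally, the sparse CCA algorithm finds a semantic correlation space that maps between text semantics and the visual words of images. Using a sparse correlation analysis method improves the interpretability of the model and keeps the retrieval results stable. Experimental results verify the effectiveness of the sparse CCA method and confirm the feasibility of the proposed semantic image retrieval method.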
A key issue in semantic-based image retrieval is how to bridge the semantic gap between low-level image features and high-level semantics, which can be expressed effectively in free text. The cross-modal relationship between text and image is studied by modeling the semantic correlation between the two. Based on this model, an approach to image retrieval is proposed in which images are retrieved according to the meaning of the query text rather than its keywords. First, an algorithm for solving sparse canonical correlation analysis (sparse CCA) is designed. Then a semantic space is learned from a text corpus by latent semantic analysis, and images are represented as bags of visual words. After that, a semantic correlation space is constructed, which makes explicit the mapping between the visual words of an image and the high-level semantics. The proposed method solves CCA in a sparse framework to make the result more interpretable and stable. The experimental results demonstrate that sparse CCA outperforms CCA in this setting and substantiate the feasibility of the proposed approach to image retrieval.
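To make the idea of a sparse correlation space concrete, the following is a minimal sketch of a rank-1 sparse CCA. The abstract does not specify the solver, so this sketch substitutes a generic alternating soft-thresholding scheme in the style of penalized matrix decomposition, which treats the within-view covariances as identity; it should not be read as the algorithm designed in the paper. The function names, the `frac_x` and `frac_y` parameters, and the synthetic data are all assumptions made for illustration: `X` stands in for latent-semantic text features and `Y` for bag-of-visual-words histograms.

```python
import numpy as np

def soft_threshold(a, frac):
    """Shrink entries of a toward zero by frac * max|a| (frac in [0, 1))."""
    t = frac * np.abs(a).max()
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def unit(a):
    """Scale a to unit Euclidean norm (left unchanged if it is all zeros)."""
    n = np.linalg.norm(a)
    return a / n if n > 0 else a

def sparse_cca_rank1(X, Y, frac_x=0.3, frac_y=0.3, n_iter=50):
    """Hypothetical helper: one pair of sparse canonical weight vectors."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)  # cross-covariance between the views
    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = unit(soft_threshold(C @ v, frac_x))
        v = unit(soft_threshold(C.T @ u, frac_y))
    return u, v

# Synthetic stand-in data: one shared latent factor z drives two text
# features (columns 0, 1) and two visual-word features (columns 2, 5).
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = rng.normal(size=(n, 6))
X[:, 0] = z + 0.3 * X[:, 0]
X[:, 1] = -z + 0.3 * X[:, 1]
Y = rng.normal(size=(n, 8))
Y[:, 2] = z + 0.3 * Y[:, 2]
Y[:, 5] = z + 0.3 * Y[:, 5]

u, v = sparse_cca_rank1(X, Y)
corr = np.corrcoef(X @ u, Y @ v)[0, 1]
print("nonzero text weights:  ", np.flatnonzero(u))
print("nonzero visual weights:", np.flatnonzero(v))
print("correlation of projections:", corr)
```

In this toy setting the nonzero entries of `u` and `v` single out the features that actually co-vary across the two views, which is the sense in which a sparse solution is easier to interpret than dense CCA weights. A retrieval system would need several such weight pairs and real text and image features, both of which are outside the scope of this sketch.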