Automatic image annotation is an important and challenging task in image retrieval. This paper first discusses and explains the automatic image annotation problem and, by reviewing existing work, proposes a unified annotation framework based on graph learning. Within this framework, annotation proceeds in two stages: basic image annotation and annotation refinement. In the basic annotation stage, graph learning driven by the similarity between images produces the candidate annotations. In the refinement stage, graph learning driven by the semantic correlation between words refines the candidates produced by the first stage. The framework mainly involves estimating the various relations within and between two media, images and text words. On this basis, the paper proposes improved methods for each of these sub-problems and combines them to achieve effective image annotation. Finally, a series of experiments on the Corel image dataset and a Web image dataset demonstrates the effectiveness of the unified framework and of the proposed improvements.
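To make the two-stage structure concrete, the following is a minimal sketch, not the paper's actual algorithm. It assumes a generic label-propagation update, F <- alpha * S F + (1 - alpha) * Y with a symmetrically normalized affinity matrix S, as a stand-in for the graph learning step in both stages. The function names `propagate` and `two_stage_annotation`, the parameters `alpha` and `iters`, and the layout of the similarity matrices are all illustrative assumptions.

```python
import numpy as np


def propagate(W, Y, alpha=0.5, iters=50):
    """Generic graph propagation (an assumed stand-in for graph learning).

    W: (n, n) symmetric, non-negative affinity matrix over graph nodes.
    Y: (n, m) initial scores attached to the nodes.
    Returns the (n, m) scores after `iters` propagation steps.
    """
    degree = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    Y = Y.astype(float)
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F


def two_stage_annotation(image_sim, word_sim, labels, alpha=0.5, iters=50):
    """Hypothetical two-stage pipeline mirroring the framework's outline.

    image_sim: (n_images, n_images) image-to-image similarity.
    word_sim:  (n_words, n_words) word-to-word semantic correlation.
    labels:    (n_images, n_words) known annotations (1 = word applies).
    Returns (candidate, refined), both of shape (n_images, n_words).
    """
    # Stage 1, basic annotation: spread known labels over the image graph.
    candidate = propagate(image_sim, labels, alpha, iters)
    # Stage 2, refinement: smooth each image's word scores over the word graph.
    refined = propagate(word_sim, candidate.T, alpha, iters).T
    return candidate, refined
```

In this sketch, a word that no labeled image carries receives a zero candidate score in the first stage and can only gain weight in the second stage through its correlation with other words, which is the role the refinement stage plays in the framework. How the two similarity matrices are estimated is the set of sub-problems the paper addresses and is left outside the sketch.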