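To address the difficulty of measuring cross-media correlation between multimedia data of different modalities, a cross-media retrieval method based on correlation reasoning is proposed. First, the similarity within a single modality (intra-media) and the correlation between different modalities (cross-media) are analyzed and quantified. A cross-media correlation graph is then constructed to give the learned similarities and correlations a unified representation, and cross-media retrieval is performed on the basis of shortest paths in this graph. A relevance feedback algorithm is also proposed that incorporates prior knowledge from user interaction into the cross-media correlation graph, effectively improving cross-media retrieval efficiency. The method can be applied to cross-modality retrieval systems in which users submit query examples.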
A cross-media retrieval approach is proposed to address the problem of measuring correlation between different modalities, such as image and audio data. First, both intra-media and cross-media correlations among multi-modality datasets are explored. Intra-media correlation measures the similarity between multimedia data of the same modality, whereas cross-media correlation measures how semantically similar two multimedia objects of different modalities are. Cross-media correlation is difficult to measure because the low-level features are heterogeneous: for example, images are represented by visual feature vectors, while audio clips are represented by auditory feature vectors that are not directly comparable with them. Intra-media correlation is calculated from geodesic distance, and cross-media correlation is estimated from link information among Web pages. Both kinds of correlation are then formalized in a cross-media correlation graph, and cross-media retrieval is performed by ranking candidates according to the weight of the shortest path in this graph. A relevance feedback technique is developed to update the knowledge of multimodal correlations by learning from user behavior, progressively enhancing retrieval performance. The approach removes the restriction to a single modality during retrieval and is applicable to query-by-example and cross-modality multimedia retrieval applications. Experimental results on an image-audio dataset are encouraging and indicate that the approach is effective.
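The retrieval step described above can be illustrated with a minimal sketch. It assumes that intra-media and cross-media correlations have already been converted into non-negative edge weights (smaller meaning more closely related); the geodesic-distance and Web-link computations are not reproduced here. All function names, the toy data, and in particular the feedback update rule (shortening the direct edge between the query and an item the user marks as relevant) are hypothetical choices made for illustration, not the algorithm of the paper.

```python
import heapq
from collections import defaultdict


def build_graph(intra_edges, cross_edges):
    """Merge intra-media and cross-media edges into one undirected weighted graph."""
    graph = defaultdict(dict)
    for u, v, w in list(intra_edges) + list(cross_edges):
        # If an edge is listed more than once, keep the smallest weight.
        w = min(w, graph[u].get(v, float("inf")))
        graph[u][v] = w
        graph[v][u] = w
    return graph


def shortest_path_weights(graph, source):
    """Dijkstra's algorithm: shortest-path weight from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def retrieve(graph, query, modality_of, target_modality, k=3):
    """Rank objects of the target modality by shortest-path weight from the query."""
    dist = shortest_path_weights(graph, query)
    hits = [(d, n) for n, d in dist.items()
            if n != query and modality_of[n] == target_modality]
    return [n for _, n in sorted(hits)[:k]]


def apply_feedback(graph, query, relevant, factor=0.5):
    """Assumed update rule (not from the paper): link the query directly to each
    item marked relevant, with weight factor * current shortest-path weight."""
    dist = shortest_path_weights(graph, query)
    for item in relevant:
        if item in dist:
            new_w = min(factor * dist[item],
                        graph[query].get(item, float("inf")))
            graph[query][item] = new_w
            graph[item][query] = new_w


# Toy data (hypothetical): two images and two audio clips.
modality_of = {"img1": "image", "img2": "image",
               "aud1": "audio", "aud2": "audio"}
intra_edges = [("img1", "img2", 0.2), ("aud1", "aud2", 0.3)]
cross_edges = [("img2", "aud1", 0.5)]

graph = build_graph(intra_edges, cross_edges)
before = retrieve(graph, "img1", modality_of, "audio")
apply_feedback(graph, "img1", relevant=["aud2"])
after = retrieve(graph, "img1", modality_of, "audio")
print(before, after)
```

In this toy graph the image query reaches the audio clips only through the single cross-media edge, so the ranking follows the accumulated path weight; after the user marks the second clip as relevant, the shortened edge moves it ahead of the first. Using an additive path weight is one possible reading of "the weight of the shortest path".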