In video semantic understanding and mining, fully exploiting the interactions among multimodal media such as image, audio, and text is an important research direction. Considering the multimodal and temporally associated co-occurrence characteristics of video, this paper proposes a semantic concept detection method based on correlation propagation across multimodal subspaces to mine video semantics. From the multimodal low-level features extracted from video shots, the method derives shot-to-shot similarity by propagating correlations across the multimodal subspaces using co-occurrence data embedding (CODE) and SimFusion; it then applies locality preserving projections (LPP) to reduce the dimensionality of the original data and obtain coordinates in a low-dimensional semantic space, and trains a classification model with the annotation labels, so that semantic concepts can be detected on test data outside the training set, realizing video semantics mining. Experiments show that the method achieves high accuracy.
Research on content-based multimedia retrieval is motivated by the growing amount of digital multimedia content, of which video data forms a large part. The interaction and integration of multiple modalities such as visual, audio, and textual data in video are the essence of video content analysis. Although each single modality conveys partial semantics, the full semantics of video are manifested only through the interaction and integration of all modalities. Video data contains rich semantics, such as people, scenes, objects, events, and stories. A great deal of research has focused on utilizing multi-modality features for a better understanding of video semantics. This paper proposes a new approach to detecting semantic concepts in video using co-occurrence data embedding (CODE), SimFusion, and locality preserving projections (LPP) applied to temporally associated, co-occurring multimodal media data in video. The authors' experiments show that by employing these key techniques, the performance of video semantic concept detection can be improved and better video semantics mining results can be obtained.
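The LPP step mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation; the neighborhood size `k`, heat-kernel width `t`, target dimensionality `n_components`, and the small regularization term are illustrative assumptions. LPP builds a k-nearest-neighbor graph over the samples, forms the graph Laplacian, and solves a generalized eigenproblem whose smallest eigenvectors give a linear projection that preserves local neighborhood structure:

```python
# Hypothetical LPP (locality preserving projections) sketch; parameters
# k, t, and n_components are illustrative, not taken from the paper.
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, k=5, t=1.0):
    """Project rows of X (n_samples x n_features) to n_components dims."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # k-nearest-neighbor adjacency with heat-kernel weights.
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(sq[i])[1:k + 1]   # nearest neighbors, skipping self
        W[i, idx] = np.exp(-sq[i, idx] / t)
    W = np.maximum(W, W.T)                 # symmetrize the graph
    D = np.diag(W.sum(axis=1))             # degree matrix
    L = D - W                              # graph Laplacian
    # Generalized eigenproblem  X^T L X a = lambda X^T D X a;
    # the eigenvectors with the smallest eigenvalues define the projection.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])  # regularize for stability
    _, vecs = eigh(A, B)
    return X @ vecs[:, :n_components]      # low-dimensional coordinates
```

In the pipeline described above, the rows of `X` would be the fused multimodal shot features (after CODE/SimFusion correlation propagation), and the returned low-dimensional coordinates would then be used to train the concept classifier.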