Video semantic concept detection is a prerequisite for bridging the "semantic gap" and enabling semantic-based video retrieval, and the bag-of-visual-words method is a representative approach to it. To address two open issues of the bag-of-visual-words method, this paper proposes a video semantic concept detection method based on latent semantic indexing (LSI) and soft weighting. First, to capture the latent semantic relationships among visual words, LSI is applied to reduce the dimensionality of a large-scale visual vocabulary, yielding a compact semantic visual vocabulary. Second, to overcome the synonymy and polysemy of visual words, a soft-weighting scheme is used to construct a visual-word distribution histogram, which serves as the feature vector representing each input key frame. Finally, a support vector machine (SVM) is trained for each semantic concept to build the high-level semantic classification model and perform video semantic concept detection. Experimental results show that the proposed method considerably improves the accuracy of video semantic concept detection.
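The soft-weighting step can be illustrated with a minimal sketch. The abstract does not specify the weighting formula, so everything here is an assumption: the function name `soft_weight_histogram`, the `top_n` parameter, the halving of the weight at each successive nearest-neighbour rank, and the similarity `1 / (1 + distance)` are hypothetical choices for illustration, not the paper's actual scheme. The LSI reduction and SVM training steps are not shown.

```python
import math


def soft_weight_histogram(descriptors, vocabulary, top_n=4):
    """Build a soft-weighted visual-word histogram for one key frame.

    descriptors: local feature vectors extracted from the key frame.
    vocabulary:  visual-word centres (same dimensionality as descriptors).
    top_n:       number of nearest visual words each descriptor votes for
                 (hypothetical parameter, not taken from the paper).
    """
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        # Rank all visual words by Euclidean distance to this descriptor.
        ranked = sorted(range(len(vocabulary)),
                        key=lambda k: math.dist(d, vocabulary[k]))
        for rank, k in enumerate(ranked[:top_n]):
            # Assumed similarity measure: decreases with distance.
            sim = 1.0 / (1.0 + math.dist(d, vocabulary[k]))
            # Assumed rank decay: the i-th nearest word gets weight 1 / 2**i.
            hist[k] += sim / (2 ** rank)
    total = sum(hist)
    # Normalise so that frames with different descriptor counts are comparable.
    return [h / total for h in hist] if total > 0 else hist


if __name__ == "__main__":
    vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
    descriptors = [(0.0, 0.0), (1.0, 0.0)]
    print(soft_weight_histogram(descriptors, vocabulary, top_n=2))
```

Unlike hard assignment, where each descriptor votes only for its single nearest visual word, each descriptor here contributes to several nearby words, which is one way to soften the effect of synonymous or ambiguous visual words.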