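To address the quantization error in current Bag of Words (BoW) methods for video semantic concept detection, and to extract low-level video features automatically and more effectively, a video semantic concept detection algorithm based on Topographic Independent Component Analysis (TICA) and the Gaussian Mixture Model (GMM) is proposed. First, features are extracted from video clips with the TICA algorithm, which can learn complex invariant features of the clips. Second, the visual features are modeled with a GMM to describe their distribution. Finally, a GMM supervector is constructed for each video clip, and a Support Vector Machine (SVM) performs the semantic concept detection. The GMM extends BoW to a probabilistic framework, which reduces quantization error and gives good robustness. In comparative experiments against the traditional BoW and SIFT-GMM methods on the TRECVID 2012 and OV video datasets, the results show that the TICA- and GMM-based method improves the accuracy of video semantic concept detection.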
To reduce the quantization error introduced by vector quantization in Bag of Words (BoW) methods for video semantic detection, and to extract features automatically and effectively, a new video semantic detection method based on Topographic Independent Component Analysis (TICA) and the Gaussian Mixture Model (GMM) was proposed. Firstly, features of each video clip were extracted by the TICA algorithm, which learns complex invariant features from video clips. Secondly, the feature distribution of each video clip was described by a GMM. Finally, a GMM supervector was created from the GMM parameters, and the GMM supervector for each shot was used as the input of a Support Vector Machine (SVM) for video semantic detection. A GMM can be regarded as an extension of BoW to a probabilistic framework; it therefore has less quantization error and better retains the information in the original feature vectors. The experiments were conducted on the TRECVID 2012 and OV datasets. The experimental results show that, compared with the BoW and SIFT (Scale Invariant Feature Transform)-GMM algorithms, the proposed method improves the mean average precision of video semantic detection on both the TRECVID 2012 and OV datasets.
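The contrast between hard BoW assignment and soft GMM assignment, and the construction of a supervector from GMM parameters, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the supervector is built, so the sketch assumes a common choice: MAP adaptation of the component means of a diagonal-covariance background GMM, stacked into one vector. The function names, the `relevance` factor, and the use of plain descriptor arrays in place of TICA features are all assumptions made for illustration.

```python
import numpy as np


def gmm_posteriors(X, weights, means, variances):
    """Soft assignment: p(component k | descriptor x) for each row of X.

    X: (N, D) descriptors; weights: (K,); means, variances: (K, D),
    i.e. a diagonal-covariance GMM (an assumption of this sketch).
    """
    diff = X[:, None, :] - means[None, :, :]                    # (N, K, D)
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )                                                           # (N, K)
    log_joint = np.log(weights)[None, :] + log_pdf
    log_joint -= log_joint.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def bow_histogram(X, means):
    """Hard assignment (BoW): each descriptor votes only for its nearest centre."""
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (N, K)
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=means.shape[0]) / len(X)


def gmm_supervector(X, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted component means into one (K * D,) vector.

    `relevance` controls how strongly the clip's own statistics override
    the background means; its default value here is arbitrary.
    """
    post = gmm_posteriors(X, weights, means, variances)
    n_k = post.sum(axis=0)                                      # soft counts, (K,)
    clip_means = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]  # (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted = alpha * clip_means + (1.0 - alpha) * means
    return adapted.ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.normal(size=(200, 2))          # stand-in for one clip's features
    w = np.array([0.5, 0.3, 0.2])
    mu = np.array([[0.0, 0.0], [2.0, 2.0], [-2.0, 1.0]])
    var = np.ones((3, 2))
    print("hard histogram:", bow_histogram(clip, mu))
    print("soft counts   :", gmm_posteriors(clip, w, mu, var).mean(axis=0))
    print("supervector   :", gmm_supervector(clip, w, mu, var).shape)
```

In a full pipeline, one such supervector per clip would form a row of the training matrix for an SVM (for example `sklearn.svm.SVC`), and the background GMM would be fitted on features pooled from many clips rather than fixed by hand as it is here.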