With the rapid development of Internet technology, the volume of video data is growing explosively. Most traditional video search engines rely on a single text-based retrieval method. For unstructured data such as video, this approach suffers from missing content information and a semantic gap, which lowers the relevance of the retrieval results. This paper proposes a video retrieval calibration method based on the bag of visual words, which combines visual feature extraction, TF-IDF, and open data technology to provide users with calibrated retrieval results. First, a clustering algorithm based on the HSV model extracts the key frame set of each video and the corresponding key frame weight vector. Second, speeded-up robust features (SURF) and other visual features of the key frame images are used to represent the video content, which addresses the missing-content problem. Third, TF-IDF is used to measure the weight of each keyword in the query, and open data is used to obtain the visual features and semantic information of the query keywords, which addresses the semantic gap. Finally, the proposed calibration algorithm is applied to the Internet Archive data set. Experimental results show that, compared with the traditional text-based video retrieval method, the proposed method achieves a 15% relative improvement in the average retrieval relevance score.
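The first and third steps described above (key frame extraction by HSV-based clustering, and TF-IDF weighting of query keywords) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sequential clustering rule, the histogram-intersection similarity, the `threshold` value, the bin counts, and the smoothed IDF formula are all assumptions made for illustration. Frames are assumed to be non-empty lists of `(r, g, b)` tuples, and the SURF and open data steps are omitted.

```python
import colorsys
import math
from collections import Counter


def hsv_histogram(frame, h_bins=8, s_bins=3, v_bins=3):
    """Normalized HSV histogram of a frame given as a list of (r, g, b) in 0..255."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in frame:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    return [x / len(frame) for x in hist]


def intersection(a, b):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(x, y) for x, y in zip(a, b))


def extract_key_frames(frames, threshold=0.7):
    """Assumed sequential clustering: a frame joins the current cluster when its
    histogram is close enough to the cluster centroid, otherwise it opens a new one.
    Returns (key frame indices, weight vector); each weight is the share of frames
    in the cluster that the key frame represents."""
    hists = [hsv_histogram(f) for f in frames]
    clusters = []
    for idx, hist in enumerate(hists):
        if clusters and intersection(hist, clusters[-1]["centroid"]) >= threshold:
            c = clusters[-1]
            n = len(c["members"])
            # Running mean keeps the centroid equal to the average member histogram.
            c["centroid"] = [(m * n + x) / (n + 1) for m, x in zip(c["centroid"], hist)]
            c["members"].append(idx)
        else:
            clusters.append({"centroid": list(hist), "members": [idx]})
    key_frames, weights = [], []
    for c in clusters:
        # The key frame is the member closest to the cluster centroid.
        best = max(c["members"], key=lambda i: intersection(hists[i], c["centroid"]))
        key_frames.append(best)
        weights.append(len(c["members"]) / len(frames))
    return key_frames, weights


def tfidf_weights(query_terms, documents):
    """TF-IDF weight of each query keyword; documents is a list of token lists.
    Uses a smoothed IDF (an assumption) so unseen terms do not divide by zero."""
    n_docs = len(documents)
    weights = {}
    for term, count in Counter(query_terms).items():
        df = sum(1 for doc in documents if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[term] = (count / len(query_terms)) * idf
    return weights
```

In this sketch a video with three uniformly red frames followed by two uniformly blue frames yields two key frames, one per colour run, and the weight vector records how much of the video each key frame stands for. A full pipeline would then describe each key frame with SURF descriptors quantized into visual words, which this example does not attempt.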