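Text extracted from video is an important source of information for semantic video understanding and retrieval. Exploiting the temporal and spatial redundancy of static text in video, this work refines detection results using the edge bitmap of the text region as a feature, and proposes a fast text tracking algorithm based on binary search that locates text objects quickly and effectively. In the segmentation stage, in addition to the conventional enhancement of text regions with gray-level fused images, the edge bitmap is used to further filter the background out of the text region. Experiments show that both text detection accuracy and segmentation quality improve considerably.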
Superimposed text provides important semantic clues for video indexing and retrieval. Text in video often spans tens or even hundreds of frames, and many researchers have exploited this temporal redundancy to improve text detection accuracy and text region quality. This paper describes a novel approach to tracking and segmenting static superimposed text using information from multiple video frames. For text detection, multiple frames are used to verify the text regions detected on a single frame. A binary-search-based text tracking method is proposed, which tracks a static text object efficiently by using features of the edge bitmap. To refine the text regions, text detection is performed again on a synthesized image produced by a minimum/maximum pixel search over the consecutive tracked frames. For text segmentation, edge features are exploited to further remove complex background, in addition to traditional gray-value integration. Experimental results show the effectiveness of the proposed method.
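The tracking and synthesis steps summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the gradient-based edge detector, the threshold values, the overlap test, the probing step, and the function names `edge_bitmap`, `same_text`, `find_last_frame`, and `integrate_frames` are all assumptions made for illustration. The sketch also assumes that a static text object stays visible in every frame between its first and last appearance, which is the property that makes a binary search valid.

```python
import numpy as np

def edge_bitmap(gray, thresh=40):
    """Binary edge map from central-difference gradients (assumed detector)."""
    g = gray.astype(np.int32)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:-1] = g[:, 2:] - g[:, :-2]
    gy[1:-1, :] = g[2:, :] - g[:-2, :]
    return (np.abs(gx) + np.abs(gy)) > thresh

def same_text(ref_bits, frame, box, min_overlap=0.8):
    """True if most reference edge pixels are also edges in this frame's box."""
    y0, y1, x0, x1 = box
    bits = edge_bitmap(frame[y0:y1, x0:x1])
    ref_count = ref_bits.sum()
    if ref_count == 0:
        return False
    return (ref_bits & bits).sum() / ref_count >= min_overlap

def find_last_frame(frames, start, box, step=16):
    """Index of the last frame that still shows the text seen at `start`.

    Probes forward in jumps of `step` (an assumed value), then binary-searches
    the bracket. Invariant: `lo` shows the text, `hi` does not (or is len(frames)).
    """
    y0, y1, x0, x1 = box
    ref_bits = edge_bitmap(frames[start][y0:y1, x0:x1])
    lo, hi = start, start + step
    while hi < len(frames) and same_text(ref_bits, frames[hi], box):
        lo, hi = hi, hi + step
    hi = min(hi, len(frames))
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if same_text(ref_bits, frames[mid], box):
            lo = mid
        else:
            hi = mid
    return lo

def integrate_frames(frames, first, last, box, text_is_bright=True):
    """Per-pixel minimum (bright text) or maximum (dark text) over tracked frames."""
    y0, y1, x0, x1 = box
    stack = np.stack([f[y0:y1, x0:x1] for f in frames[first:last + 1]])
    return stack.min(axis=0) if text_is_bright else stack.max(axis=0)
```

Under these assumptions, the search compares only a logarithmic number of frames within each bracket instead of every frame. For bright text, the per-pixel minimum keeps the static text pixels high while darkening any background pixel that varies over time; the maximum plays the same role for dark text.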