Abstract: This paper proposes a movie summarization algorithm that fuses audio and visual features. Close-up face detection and the detection of tense, high-intensity shots serve as the criteria for selecting important video clips. Because speech endpoints in movies are difficult to detect from the audio track, the method instead uses the movie's subtitle file to extract the start and end time of each utterance together with its spoken content, which yields accurate speech endpoint detection. Experiments, including a comparison with other summarization algorithms, show that the summaries generated by the proposed method are effective.
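The subtitle-based endpoint extraction can be illustrated with a minimal sketch. The abstract does not name a subtitle format, so the SubRip (SRT) layout is assumed here, and the names `SpeechSegment` and `parse_srt` are hypothetical, introduced only for illustration. Each subtitle cue is treated as one speech segment whose start and end times stand in for the speech endpoints.

```python
import re
from typing import List, NamedTuple


class SpeechSegment(NamedTuple):
    """One utterance taken from a subtitle cue (hypothetical helper type)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # spoken content of the cue


# Assumed SRT timestamp layout: HH:MM:SS,mmm --> HH:MM:SS,mmm
_TIME = r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
_CUE = re.compile(_TIME + r"\s*-->\s*" + _TIME)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(srt_text: str) -> List[SpeechSegment]:
    """Return one speech segment per subtitle cue, in file order."""
    segments: List[SpeechSegment] = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _CUE.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the spoken text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                segments.append(
                    SpeechSegment(_seconds(*fields[:4]), _seconds(*fields[4:]), text)
                )
                break
    return segments
```

The resulting list of `(start, end, text)` segments could then be aligned with the visually selected clips so that a summary clip does not cut through an utterance; how the two are combined is not specified in the abstract and is left open here.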