针对语音文稿已知的情况,提出了一种简单方法实现了适用于在线语音流的字幕自动生成系统。主要思路是根据文稿分句的情况确定在线语音的句子边界,进而,将相应的句子显示到屏幕上。假设在线语音的句子起点已知,本文建立了具有帧同步的统计假设似然比模型检测在线语音的句子尾点,在HMM框架下对该模型进行求解。实验表明,如果以检测到的句子尾点与真正的句子尾点的时间差作为指标,对于干净语音,99.5%左右的时间差在一秒以内,达到了实际要求。最后,本文利用所提出的针对在线语音流的字幕自动生成算法,实现了一个适用于在线新闻广播加字幕场景的演示系统。
This paper proposes a simple method for automatically generating subtitles for live speech streams when an accurate transcript of the speech is known in advance. The main idea is to determine the sentence boundaries in the live speech according to the sentence segmentation of the transcript, and then to display the sentence corresponding to the current speech on the screen. Assuming that the start point of each sentence is known, we build a frame-synchronous likelihood ratio test (FS-LRT) model to detect the end point of the sentence and solve it within the HMM framework. The algorithm is evaluated by the time difference between each detected end point and the ground-truth end point. Experiments show that, for clean speech, about 99.5% of the detections fall within 1 s of the true end point, which meets practical requirements. Finally, based on the proposed algorithm, we implement a demo system that adds subtitles to live news broadcasts.
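To make the idea of frame-synchronous end-point detection concrete, the following is a minimal toy sketch and not the model used in this paper. It assumes a left-to-right chain of states with one-dimensional Gaussian emissions, in which the last state stands for whatever follows the sentence (for example, silence). The function name `detect_sentence_end` and the parameters `threshold` and `hold` are hypothetical. At every frame, a Viterbi update is followed by a log-likelihood ratio between the best path that has already left the sentence and the best path that is still inside it.

```python
import math

NEG_INF = float("-inf")


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (toy emission model, an assumption)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def detect_sentence_end(frames, state_means, var=1.0, stay_prob=0.5,
                        threshold=0.0, hold=3):
    """Return the index of the frame at which the end is declared, or None.

    state_means: means of a left-to-right state chain (at least two states);
                 the LAST state models what follows the sentence.
    threshold:   log-likelihood ratio that must be exceeded (hypothetical).
    hold:        number of consecutive frames the ratio must stay above
                 the threshold before the end is declared (hypothetical).
    """
    n = len(state_means)
    stay = math.log(stay_prob)
    move = math.log(1.0 - stay_prob)
    scores = [NEG_INF] * n
    run = 0
    for t, x in enumerate(frames):
        if t == 0:
            # The sentence start is assumed known: begin in the first state.
            new = [NEG_INF] * n
            new[0] = gaussian_logpdf(x, state_means[0], var)
        else:
            new = []
            for j in range(n):
                best = scores[j] + stay
                if j > 0:
                    best = max(best, scores[j - 1] + move)
                new.append(best + gaussian_logpdf(x, state_means[j], var))
        scores = new
        # Ratio of "sentence has ended" against "sentence is still going on".
        llr = scores[-1] - max(scores[:-1])
        run = run + 1 if llr > threshold else 0
        if run >= hold:
            return t
    return None


# Toy input: two speech states followed by trailing silence.
frames = [2.0] * 5 + [6.0] * 5 + [-3.0] * 5
print(detect_sentence_end(frames, [2.0, 6.0, -3.0]))  # 12
```

In this toy input the first silence frame has index 10, and the end is declared at frame 12 because `hold=3` requires three consecutive frames above the threshold. A larger `hold` trades detection delay for fewer false alarms, which is the kind of time difference the evaluation measures.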