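Redundancy is one of the distinctive linguistic phenomena that pervade spoken dialogue, and its presence often hinders the understanding and translation of spoken sentences. This paper analyzes redundancy on the basis of a corpus of real spoken dialogues, classifies it at the lexical level, and then studies statistical methods for recognizing redundant words in speech. Translation experiments on spoken sentences before and after the redundant words are processed show that handling redundancy in advance improves the quality of spoken language translation output.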
Fillers and redundancy are among the most common phenomena in spoken dialogs, and they often degrade the results of spoken language understanding and translation systems. Based on an analysis and statistical classification of fillers at the lexical level of a spoken dialog corpus, we propose statistical methods to recognize them. We conducted translation experiments on spoken sentences before and after filler processing. The results show that the performance of a spoken language translation system improves significantly when fillers are processed before translation.