Application of Group Delay Spectrum Parameters in Chinese Digit Speech Recognition

ISSN: 1003-0530
Journal: 《信号处理》 (Journal of Signal Processing)
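Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliation: School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006
Funding: National Natural Science Foundation of China (61271360); Natural Science Foundation of Jiangsu Province (BK20131196)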

Authors: 周峰 (Zhou Feng), 俞一彪 (Yu Yibiao)

Keywords: digit recognition, group delay, multi-level recognition

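Chinese abstract (translated):

The high confusability among spoken Chinese digits directly degrades Chinese digit speech recognition, and conventional speech recognition methods struggle to separate easily confused utterances. This paper proposes a multi-parameter, multi-level recognition strategy: a first-stage digit recognizer uses Mel spectral parameters with an HMM, and easily confused digit pairs are then reclassified by an SVM using a new group delay spectral parameter, RRCGD-CC (Reflected Roots Chirp Group Delay-Cepstral Coefficients). Experimental results show that with this method the recognition rate for the digits "2" and "8" improves by 8% and the overall recognition rate of the digit recognition system improves by 2.3%. These results indicate that the proposed multi-parameter, multi-level method improves the performance of Chinese digit speech recognition systems, and that RRCGD-CC is effective for recognizing easily confused spoken digits.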

English abstract:

The high confusability among Chinese digits directly affects the performance of Chinese digit speech recognition, and traditional methods have difficulty distinguishing easily confused digits. This paper presents a multi-parameter, multi-level recognition strategy. First, the digits are recognized using Mel spectral parameters with an HMM. Then the easily confused digits undergo a secondary classification by an SVM using RRCGD-CC (Reflected Roots Chirp Group Delay-Cepstral Coefficients), a new parameter based on the group delay spectrum. The experimental results show that the recognition rate for "2" and "8" is improved by 8%, and the recognition rate of the whole system is improved by 2.3%. This result shows that RRCGD-CC is effective for easily confused digits.
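
The two-stage decision flow described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the HMM and SVM are replaced by toy stand-in functions, the feature vectors are placeholders rather than real Mel or RRCGD-CC features, and the set of confusable pairs contains only the "2"/"8" pair that the abstract names. All function and variable names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def two_stage_recognize(
    mel_features: Features,
    gd_features: Features,
    first_stage: Classifier,
    pair_classifiers: Dict[FrozenSet[str], Classifier],
) -> str:
    """Route an utterance through a first-stage recognizer, then re-check
    it with a pairwise classifier if the first result is a confusable digit."""
    # Stage 1: recognize the digit from the first feature set
    # (Mel spectral parameters with an HMM in the paper; a stand-in here).
    label = first_stage(mel_features)

    # Stage 2: if the label belongs to a confusable pair, let the pairwise
    # classifier decide using the second feature set
    # (RRCGD-CC with an SVM in the paper; a stand-in here).
    for pair, classify in pair_classifiers.items():
        if label in pair:
            return classify(gd_features)

    # Labels outside every confusable pair keep the first-stage result.
    return label


# Toy stand-ins (assumptions for illustration only).
def toy_first_stage(mel: Features) -> str:
    return "8" if mel[0] > 0.5 else "5"


def toy_pair_2_8(gd: Features) -> str:
    return "2" if gd[0] < 0.0 else "8"


PAIRS: Dict[FrozenSet[str], Classifier] = {frozenset({"2", "8"}): toy_pair_2_8}

if __name__ == "__main__":
    # First stage says "8", which is confusable, so the pair classifier decides.
    print(two_stage_recognize([0.9], [-1.0], toy_first_stage, PAIRS))  # 2
    # First stage says "5", which is not in any pair, so it is returned as-is.
    print(two_stage_recognize([0.1], [-1.0], toy_first_stage, PAIRS))  # 5
```

Keeping the second stage as a lookup from digit pairs to classifiers means further confusable pairs could be added without touching the routing logic.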
