本文提出了一种基于SDC特征和GMM-UBM模型的自动语种识别方法。SDC特征由许多语音帧的一阶差分谱连接扩展而成,与传统的MFCC特征相比,包含了更多的时序特征信息。UBM模型反映了所有待识别语种的特征分布特性,借助贝叶斯自适应算法可以快速得到每个语种的模型。与传统的GMM方法相比,该方法的训练和识别的速度更快。谊方法对OGI电话语音库中11个语种进行了测试,其10秒、30秒和45秒句子的最佳识别正确率分别为72.38%、82.62%和85.23%,识别速度约为0.03倍实时。
This paper presents an automatic language identification (LID) system which uses shifted delta cepstra (SDC) feature vectors and universal background model (UBM). SDC feature is created by stacking delta cepstra computed across multiple speech frames and is involved with much more temporal information than conventional MFCC feature. UBM represents the characteristic of all different languages and each language model is obtained by employing the Bayesian adaptation from this UBM. Compared with the conventional GMM method, the training and testing speed of this method is much faster. This system performance is evaluated on the OGI corpus. The best identification accuracy for 11-languages is 73.28% for 10-s utterances, 82.62% for 30-s utterances and 85. 23% for 45 s utterances. The processing speed is about 0.03 times real time.