Feature extraction is the most critical stage of a Tibetan speech recognition system. Based on an analysis of the characteristics of Tibetan pronunciation, this paper builds a feature extraction algorithm on Mel-frequency cepstral coefficients (MFCC), which model the human auditory system. The extracted features are then compressed with the LDA information-compression algorithm, which reduces their dimensionality while improving both the recognition rate and computational efficiency. The result is an LDA-MFCC feature extraction algorithm suited to the characteristics of Tibetan speech.
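The two-stage pipeline summarized above (MFCC extraction followed by LDA compression) can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes that LDA denotes linear discriminant analysis, and every parameter value (16 kHz sampling, 400-sample frames, 26 Mel filters, 13 coefficients, a pre-emphasis factor of 0.97) is a common default chosen for illustration. The function names `mel_filterbank`, `mfcc`, and `lda_fit` are hypothetical, and the input is three synthetic noisy tones standing in for labeled speech classes, not Tibetan recordings.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale, shape (n_filters, n_fft//2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) matrix of MFCCs (assumed default parameters)."""
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum -> Mel filterbank energies -> log compression.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II (unnormalized) decorrelates the log energies into cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energy @ dct.T

def lda_fit(X, y, n_components):
    """Linear discriminant analysis: return a (d, n_components) projection matrix."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Directions that maximize between-class relative to within-class scatter.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Synthetic stand-in data: three "classes" of noisy tones (NOT Tibetan speech).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
feats, labels = [], []
for label, freq in enumerate([300.0, 1200.0, 3000.0]):
    sig = np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(t.size)
    m = mfcc(sig)
    feats.append(m)
    labels.append(np.full(len(m), label))
X = np.vstack(feats)
y = np.concatenate(labels)

W = lda_fit(X, y, n_components=2)  # at most (number of classes - 1) useful axes
Z = X @ W                          # compressed LDA-MFCC features
print(X.shape, "->", Z.shape)
```

With C classes, linear discriminant analysis yields at most C - 1 discriminative directions, so the 13-dimensional MFCC vectors of this three-class toy example are compressed to two dimensions. The dimensionality retained for Tibetan speech would depend on the number of recognition units and is not specified here.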