The traditional Gaussian mixture model (GMM) represents speech only through spectral features, expressed as cepstral coefficients, and ignores the speaker's pitch (fundamental frequency) information. To address this problem, a pitch-integrated Gaussian mixture model based on the multi-space probability distribution is proposed. The model selectively distinguishes voiced from unvoiced speech within each Gaussian component space, so that cepstral and pitch features are modeled jointly in a unified framework. Experimental results show that, after re-estimation of the model parameters, the proposed model achieves a higher recognition rate than two different baseline systems on both the TIMIT and NTIMIT speech corpora.
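
As a rough illustration of how voiced and unvoiced frames might be scored under such a model, the sketch below computes the log-likelihood of a single frame. It is not taken from the paper: the diagonal covariances, the per-component voiced probability `voiced_prob`, the scalar pitch Gaussian, and all names (`frame_log_likelihood`, `params`, `f0_mean`, and so on) are assumptions made for illustration only. A voiced frame contributes both a cepstral and a pitch term; an unvoiced frame contributes only the cepstral term, weighted by the unvoiced probability of each component.

```python
import numpy as np


def log_gauss_diag(x, mean, var):
    """Log density of diagonal-covariance Gaussians, one value per component."""
    x = np.atleast_1d(x)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)


def frame_log_likelihood(cep, pitch, params):
    """Log-likelihood of one frame; pass pitch=None for an unvoiced frame.

    params (hypothetical layout, M components, D cepstral dimensions):
      weights      (M,)   mixture weights, summing to 1
      voiced_prob  (M,)   probability of the voiced space, strictly in (0, 1)
      cep_mean     (M, D) cepstral means
      cep_var      (M, D) cepstral variances
      f0_mean      (M,)   pitch means
      f0_var       (M,)   pitch variances
    """
    # Cepstral term is shared by both the voiced and the unvoiced space.
    log_p = np.log(params["weights"]) + log_gauss_diag(
        cep, params["cep_mean"], params["cep_var"]
    )
    if pitch is None:
        # Unvoiced space: no pitch observation, only the space weight.
        log_p = log_p + np.log1p(-params["voiced_prob"])
    else:
        # Voiced space: space weight plus a one-dimensional pitch Gaussian.
        log_p = log_p + np.log(params["voiced_prob"])
        log_p = log_p + log_gauss_diag(
            pitch, params["f0_mean"][:, None], params["f0_var"][:, None]
        )
    # Sum over components in the log domain.
    return np.logaddexp.reduce(log_p)
```

Under these assumptions, an utterance score would be the sum of `frame_log_likelihood` over its frames. How the paper actually parameterizes and re-estimates the voiced and unvoiced spaces is not stated in the abstract.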