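This paper proposes a model structure optimization method based on a discriminative criterion for adjusting how Gaussian mixture components are allocated across the states of the acoustic model in an HMM-based automatic speech recognition system. By optimizing the selected criterion, the acoustic model can achieve better recognition performance with the same number of parameters, or can reduce the number of model parameters required while maintaining comparable performance. Compared with conventional model structure optimization criteria based on likelihood and a complexity penalty, optimization based on a discriminative criterion improves the discriminative power of the model more directly and thus yields better recognition results. Experimental results on a Chinese continuous digit string recognition task targeting embedded systems show that model structure optimization based on the maximum mutual information criterion achieves better recognition results than conventional methods based on model likelihood and complexity.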
This paper presents a model topology optimization method based on a discriminative criterion, which adjusts the number of Gaussian mixture components allocated to each Hidden Markov Model (HMM) state in the acoustic model of an automatic speech recognition system. By optimizing the selected criterion, the acoustic model achieves better recognition performance with the same number of parameters, or needs fewer parameters while maintaining comparable performance. Compared with conventional criteria based on likelihood and a complexity penalty, the proposed method improves the discriminative power of the acoustic model more directly and therefore yields better recognition results. Experimental results on a continuous Chinese digit string recognition task designed for embedded systems show that topology optimization based on the maximum mutual information (MMI) criterion outperforms conventional likelihood- and complexity-based criteria.
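For orientation, the two families of criteria contrasted above can be written in their common textbook forms. This is only a sketch under stated assumptions: the abstract does not give the paper's exact formulation, and the symbols below ($M_j$, $M_{\text{total}}$, $\mathcal{O}_r$, $w_r$, $\alpha$, $\#(\lambda)$, $N$) are notation introduced here for illustration, with a BIC-style penalty assumed as a typical example of a likelihood and complexity-penalty criterion.

```latex
% Output density of HMM state j with M_j Gaussian mixture components;
% the topology being optimized is the set of counts {M_j}.
\[
  b_j(\mathbf{o}) = \sum_{m=1}^{M_j} c_{jm}\,
    \mathcal{N}\!\left(\mathbf{o};\, \boldsymbol{\mu}_{jm}, \boldsymbol{\Sigma}_{jm}\right),
  \qquad \sum_{j} M_j = M_{\text{total}} \quad \text{(fixed parameter budget)}
\]

% Likelihood with a complexity penalty (BIC-style, assumed form):
% \#(\lambda) is the number of free parameters, N the number of training frames,
% \alpha a tunable penalty weight.
\[
  \mathrm{BIC}(\lambda) = \log p(\mathcal{O} \mid \lambda)
    - \frac{\alpha}{2}\, \#(\lambda)\, \log N
\]

% Maximum mutual information (MMI) criterion over R training utterances:
% w_r is the reference transcription of utterance r, and the denominator
% sums over competing hypotheses w.
\[
  \mathcal{F}_{\mathrm{MMI}}(\lambda) = \sum_{r=1}^{R}
    \log \frac{p_\lambda(\mathcal{O}_r \mid w_r)\, P(w_r)}
              {\sum_{w} p_\lambda(\mathcal{O}_r \mid w)\, P(w)}
\]
```

The penalized-likelihood score depends only on how well the model fits the correct transcriptions, whereas the MMI objective also penalizes probability mass assigned to competing hypotheses, which is why it is described as acting on discriminative power more directly.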