Dynamic Bayesian network (DBN) based speech recognition has attracted growing interest in recent years because DBNs are more interpretable, factorizable, and extensible than conventional hidden Markov models (HMMs). Although DBNs have been introduced successfully into many areas of speech recognition and show promising potential to overcome inherent limitations of HMMs, previous work on DBN based speech recognition has focused mainly on isolated word recognition, and the frameworks and recognition algorithms for DBN based continuous speech recognition are far less mature and flexible than their HMM based counterparts. This paper addresses the flexibility and extensibility of DBN based continuous speech recognition. To this end, the token passing model, which handles these problems well in HMM based continuous speech recognition, is adapted to DBNs, and a general framework built on the adapted model is proposed that combines the advantages of the token passing model and of DBNs. Under this framework, a novel recognition algorithm that is independent of the upper-layer language model is proposed. The paper also introduces DTK, a toolkit developed by the authors on top of this framework for building DBN based continuous speech recognition systems and other temporal (time-series) systems.