A Joint Acoustic Modeling Method of Multi-Features Based on Deep Neural Networks

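ISSN: 1000-1239
Journal: 《计算机研究与发展》 (Journal of Computer Research and Development)
Time: 0
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]; TN912.3 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliation: Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450002
Funding: National Natural Science Foundation of China (61175017, 61403415, 61302107)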

Keywords: speech recognition, deep neural network (DNN), acoustic models, low-rank matrix factorization, fusion

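Chinese abstract (translated):

To exploit the complementary information carried by different acoustic features and the relatedness of the tasks in acoustic modeling, a multi-feature joint acoustic modeling method based on deep neural networks is proposed. First, drawing on the ideas of multimodal and multitask learning with deep neural networks (DNNs), the method shares some of the DNN hidden layers to link the acoustic models built on different features, thereby uncovering the common explanatory factors hidden across the learning tasks and enabling knowledge transfer and mutual performance gains. Second, low-rank matrix factorization is used to reduce the number of model parameters to be estimated and to speed up training, and the recognition results obtained with the different features are fused with the ROVER (recognizer output voting error reduction) algorithm to further improve recognition performance. Continuous speech recognition experiments on TIMIT show that, with joint acoustic modeling, every feature achieves better recognition performance than when it is modeled independently. In terms of phone error rate (PER), the ROVER fusion result under joint modeling is about 4.6% lower, in relative terms, than the ROVER fusion result under independent modeling.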
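The abstract states that low-rank matrix factorization reduces the number of parameters to estimate. A minimal sketch of the arithmetic follows, assuming a single weight matrix of size n_in x n_out is replaced by two factors of sizes n_in x rank and rank x n_out. The layer sizes and the rank below are hypothetical and are not taken from the paper, which does not state them in this record.

```python
def full_params(n_in, n_out):
    """Number of weights in an unfactorized n_in x n_out matrix."""
    return n_in * n_out


def low_rank_params(n_in, n_out, rank):
    """Number of weights after factorizing into (n_in x rank) and (rank x n_out)."""
    return rank * (n_in + n_out)


# Hypothetical sizes chosen only for illustration.
N_IN, N_OUT, RANK = 1024, 2000, 128

full = full_params(N_IN, N_OUT)
low = low_rank_params(N_IN, N_OUT, RANK)
ratio = low / full

print(f"full: {full}, low-rank: {low}, ratio: {ratio:.3f}")
```

Under these assumed sizes the factorized layer keeps roughly a fifth of the original weights. The factorization only saves parameters when the rank is smaller than n_in * n_out / (n_in + n_out).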

English abstract:

In view of the complementary information among different acoustic features and the relevance among the tasks involved in training acoustic models, a joint acoustic modeling method of multi-features based on deep neural networks (DNNs) is proposed. In this method, following the ideas of DNN multimodal and multitask learning, part of the DNN hidden layers are shared to associate the DNN acoustic models built with different features. By training the acoustic models together, the common hidden explanatory factors among the different learning tasks are exploited, which allows knowledge to be transferred across tasks. Moreover, the number of model parameters is decreased by using the low-rank matrix factorization method, which reduces the training time. Lastly, the recognition results from different acoustic features are combined by using the recognizer output voting error reduction (ROVER) algorithm to further improve the performance. Experimental results of continuous speech recognition on the TIMIT database show that the joint acoustic modeling method performs better than modeling each feature independently. In terms of phone error rate (PER), the result combined by ROVER based on the joint acoustic models yields a relative reduction of about 4.6% compared with the result based on the independent acoustic models.
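The voting step behind ROVER-style fusion can be illustrated with a simplified sketch. It assumes the phone hypotheses from the different feature streams have already been aligned slot by slot, with a null label marking deletions. The full ROVER algorithm also builds that alignment by dynamic programming and can weight votes by confidence scores, both of which are omitted here. The function name `vote`, the null label `-`, and the example phone strings are hypothetical.

```python
from collections import Counter


def vote(aligned_hypotheses, null="-"):
    """Pick the most frequent label in each slot of pre-aligned hypotheses.

    Ties go to the label seen first in that slot, so the order of the
    hypotheses acts as a tie-breaker. Slots won by the null label are dropped.
    """
    if len({len(h) for h in aligned_hypotheses}) != 1:
        raise ValueError("hypotheses must be aligned to the same length")
    fused = []
    for slot in zip(*aligned_hypotheses):
        winner, _count = Counter(slot).most_common(1)[0]
        if winner != null:
            fused.append(winner)
    return fused


# Three hypothetical, already aligned phone sequences, one per feature stream.
hyps = [
    ["sil", "k", "ae", "t", "sil"],
    ["sil", "k", "eh", "t", "sil"],
    ["sil", "g", "ae", "-", "sil"],
]

fused = vote(hyps)
print(fused)
```

Each stream makes a different error in this example, and the majority in each slot recovers a sequence that no single erroneous stream dominates, which is the intuition behind fusing the outputs of models trained on complementary features.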
