The high computational complexity of recurrent neural network (RNN) language models makes training very inefficient, which is a bottleneck in practical applications. To address this problem, a parallel optimization training algorithm based on mini-batch processing is proposed. The algorithm exploits the massive computational power of the GPU to accelerate the matrix and vector operations involved in network training; the optimized network can process multiple data streams in parallel, i.e., train several sentence samples simultaneously, thereby accelerating the training process. Experiments show that the optimized algorithm effectively speeds up RNN language model training with negligible loss of model performance, and the algorithm is validated in a real Chinese speech recognition system.
High computational complexity leads to low efficiency in training a recurrent neural network (RNN) language model, which becomes a major bottleneck in practical applications. To deal with this problem, this paper proposes a mini-batch based parallel optimization algorithm that speeds up matrix and vector operations by taking advantage of the GPU's computational capability. The optimized network can handle multiple data streams in parallel and train several sentence samples simultaneously, so the training process is significantly accelerated. Experimental results show that training of the RNN language model is effectively accelerated without noticeable sacrifice of model performance. The algorithm is verified in an actual Chinese speech recognition system.
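To illustrate the core idea, the following is a minimal sketch of a mini-batched RNN hidden-state update; it is not the paper's implementation. All names and sizes (V, H, B, W_ih, W_hh) are hypothetical, and NumPy stands in for a GPU linear-algebra library: processing B sentences at once replaces B matrix-vector products with a single matrix-matrix product, which is exactly the kind of operation a GPU executes efficiently.

```python
import numpy as np

rng = np.random.default_rng(0)

V, H, B = 10000, 256, 32                    # vocabulary, hidden size, mini-batch size (assumed)
W_ih = rng.standard_normal((H, V)) * 0.01   # input-to-hidden weights
W_hh = rng.standard_normal((H, H)) * 0.01   # hidden-to-hidden (recurrent) weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(x_onehot, h_prev):
    """One time step for a whole mini-batch.

    x_onehot: (V, B) one-hot input words, one column per sentence
    h_prev:   (H, B) previous hidden states, one column per sentence
    Returns the (H, B) new hidden states via two batched matrix products.
    """
    return sigmoid(W_ih @ x_onehot + W_hh @ h_prev)

# Advance 32 sentences by one word in a single pair of matrix-matrix products.
x = np.zeros((V, B))
x[rng.integers(0, V, B), np.arange(B)] = 1.0   # one random word per sentence
h = step(x, np.zeros((H, B)))
print(h.shape)  # (256, 32): one hidden-state column per sentence in the batch
```

In a serial implementation the same work would be a loop of B matrix-vector products per time step; batching the sentences column-wise is what lets the GPU's parallel hardware be fully utilized.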