To address the dialog generation problem, a dialog generation model based on hierarchical encoding and deep reinforcement learning, the Enhanced Hierarchical Recurrent Encoder-Decoder (EHRED), is proposed. It targets the tendency of standard sequence-to-sequence (seq2seq) architectures to generate generic responses, a tendency that stems from using maximum likelihood estimation (MLE) as the training objective. The method combines hierarchical encoding with reinforcement learning. Hierarchical encoding is used to model multi-turn dialog: an intermediate layer is added to the standard seq2seq architecture to strengthen the memory of earlier utterances. A language model is then used to construct the reward function, and the policy gradient method from reinforcement learning replaces the original MLE loss function during training. Experimental results show that EHRED generates responses with richer semantic information and, in standard human evaluation, outperforms the widely used standard seq2seq Recurrent Neural Network (RNN) model by 5.7 to 11.1 percentage points.
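The idea of replacing an MLE loss with a policy gradient update can be illustrated with a minimal REINFORCE sketch. This is not the EHRED implementation: the policy here is a toy distribution over three candidate replies rather than a hierarchical encoder-decoder, and `TOY_REWARD` is a hypothetical reward table standing in for the language-model-based reward. All names, the learning rate, and the baseline value are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


# Hypothetical rewards: reply 0 is a generic answer, reply 2 an informative one.
# In the abstract's setting the reward would come from a language model instead.
TOY_REWARD = [0.1, 0.5, 1.0]


def reinforce_step(logits, reward_fn, rng, lr=0.5, baseline=0.5):
    """One REINFORCE update: sample a reply, then move the logits along
    (reward - baseline) * grad log pi(reply)."""
    probs = softmax(logits)
    action = sample(probs, rng)
    advantage = reward_fn(action) - baseline
    # For a softmax policy, d log pi(a) / d logit_i = 1[i == a] - probs[i].
    return [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train(steps=500, seed=0):
    """Start from a policy biased toward the generic reply and apply
    repeated REINFORCE updates; returns the final reply probabilities."""
    rng = random.Random(seed)
    logits = [2.0, 0.0, 0.0]  # assumed starting point favoring the generic reply
    for _ in range(steps):
        logits = reinforce_step(logits, lambda a: TOY_REWARD[a], rng)
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

In this toy setting, sampled replies with above-baseline reward have their probability raised and those with below-baseline reward have it lowered, so probability mass moves away from the generic reply. A real system would sample full token sequences from the decoder and backpropagate the same reward-weighted log-likelihood gradient through the network.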