Pruning thresholds in continuous speech recognition decoding are usually set in a way that cannot balance decoding speed against accuracy. To address this, a joint optimization algorithm for multi-dimensional pruning threshold parameters is proposed. The algorithm covers four pruning thresholds: the global (main beam) threshold, the word-end threshold, the number of active models, and the number of tokens. Multi-objective optimization theory is first applied to optimize the four parameters jointly; the optimization results are then post-processed with a segment-based dynamic threshold strategy. Experimental results show that one-pass decoding with the optimized thresholds markedly improves the decoder's pruning performance, keeps the size of the search space under effective control, and achieves the intended trade-off between speed and accuracy.
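To make the joint optimization step concrete, the following is a minimal sketch of one common multi-objective technique, Pareto-front selection, applied to candidate settings of the four thresholds. It is not the algorithm of the paper, whose details are not given here. The names `Config`, `dominates`, `pareto_front` and `pick_fastest_within` are hypothetical, and the two objectives (real-time factor and word error rate) are assumed stand-ins for "speed" and "accuracy".

```python
from typing import List, NamedTuple, Optional


class Config(NamedTuple):
    """One candidate setting of the four thresholds plus its measured cost.

    All field names are illustrative assumptions, not taken from the paper.
    """
    global_beam: float    # global (main beam) threshold
    word_end_beam: float  # word-end threshold
    max_models: int       # cap on the number of active models
    max_tokens: int       # cap on the number of tokens
    rtf: float            # real-time factor: decode time / audio time (lower is faster)
    wer: float            # word error rate in percent (lower is more accurate)


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.rtf <= b.rtf and a.wer <= b.wer
    strictly_better = a.rtf < b.rtf or a.wer < b.wer
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


def pick_fastest_within(front: List[Config], max_wer: float) -> Optional[Config]:
    """Return the fastest configuration whose error rate is at most max_wer."""
    admissible = [c for c in front if c.wer <= max_wer]
    return min(admissible, key=lambda c: c.rtf) if admissible else None
```

In use, each `Config` would be filled by running the decoder once on a development set with that threshold setting; `pareto_front` then discards settings that are both slower and less accurate than some alternative, and `pick_fastest_within` selects an operating point for a given accuracy budget. The segment-based dynamic threshold post-processing is not sketched because the abstract does not describe it in enough detail.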