Question retrieval models based on statistical machine translation rank candidate questions mainly by the translation probabilities between terms. Existing models of this kind, however, do little to control the noise introduced by the translation model, and this noise degrades the retrieval results. In this paper, we propose a topic inference based translation model for question retrieval. We show theoretically that using topic information to constrain translation keeps the translation noise under control, which in turn improves question retrieval. Experimental results show that the proposed model significantly outperforms state-of-the-art question retrieval models on MAP (Mean Average Precision), MRR (Mean Reciprocal Rank), and p@1 (precision at position one).
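For readers unfamiliar with the three evaluation measures named above, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper's evaluation code. The input layout (one list of boolean relevance judgments per query, in ranked order) and the function names are assumptions made for illustration, and average precision is normalized here by the number of relevant items that appear in the ranked list, which assumes every relevant item was retrieved.

```python
from typing import Dict, Sequence


def precision_at_1(ranked: Sequence[bool]) -> float:
    """1.0 if the top-ranked result is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] else 0.0


def reciprocal_rank(ranked: Sequence[bool]) -> float:
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for rank, relevant in enumerate(ranked, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant result.

    Assumption: all relevant items are present in `ranked`, so the
    normalizer is the number of relevant items found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate(runs: Sequence[Sequence[bool]]) -> Dict[str, float]:
    """Average the per-query scores over all queries (hypothetical helper)."""
    if not runs:
        raise ValueError("at least one query is required")
    n = len(runs)
    return {
        "MAP": sum(average_precision(r) for r in runs) / n,
        "MRR": sum(reciprocal_rank(r) for r in runs) / n,
        "P@1": sum(precision_at_1(r) for r in runs) / n,
    }


if __name__ == "__main__":
    # Three toy queries; True marks a relevant result at that rank.
    toy_runs = [
        [True, False, True],
        [False, True, False],
        [False, False, False],
    ]
    print(evaluate(toy_runs))
```

A query with no relevant result contributes 0 to all three averages in this sketch; evaluation toolkits differ on that convention, so it should be checked against whatever tool produced the reported numbers.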