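Developers usually search for relevant software Q&A documents through the search engines of Q&A websites. Among the retrieved results, Q&A documents that contain high-quality code snippets (usage examples) tend to be preferred, but how to measure the quality of the code snippets in these documents remains a major challenge. To address this problem, a code-pattern-based method for optimizing software Q&A document retrieval is proposed. Starting from the current retrieval results, the method extracts the code snippets in the documents, analyzes the common code patterns in those snippets, measures the quality of each document's code snippets based on the code patterns, and recommends high-quality software Q&A documents to the user from the original retrieval results. Experiments were conducted on real problems that software developers encountered in practice; compared with the search results of StackOverflow, the proposed method improves the accuracy metric NDCG@5 by 40%.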
Developers often search Q&A websites for software Q&A documents relevant to their problems. Among the search results, documents that contain good code snippets (usage examples) are preferred. However, measuring the quality of the code snippets in these documents remains a major challenge. To address this issue, this paper proposes an approach for refining software Q&A document search results based on code patterns. First, code snippets are extracted from each document in the search results. Then, the common code patterns are mined and used to measure the quality of those code snippets. Finally, the documents with high-quality code snippets are recommended and ranked at the top of the search results. The approach is evaluated on 10 real problems that software developers met in practice. Compared with the search results of StackOverflow, the proposed approach improves NDCG@5 by 40%.
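The three steps (extract snippets, mine common patterns, re-rank) and the NDCG@5 metric can be illustrated with a minimal Python sketch. It is not the paper's implementation. As a simplifying assumption, a "code pattern" is reduced here to a called function or method name, and a document's score is the average share of retrieved documents that use the same calls. The names `extract_calls`, `rerank` and `ndcg_at_k`, the `<pre><code>` markup assumption, and the linear-gain form of NDCG are all choices made for this illustration.

```python
import math
import re
from collections import Counter

# Assumption: snippets are wrapped in <pre><code>...</code></pre> markup.
CODE_RE = re.compile(r"<pre[^>]*><code[^>]*>(.*?)</code>\s*</pre>", re.DOTALL)
# Assumption: a "pattern" is approximated by an identifier followed by "(".
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
KEYWORDS = {"if", "for", "while", "switch", "catch", "return"}


def extract_calls(html):
    """Return the set of called names found in a document's code snippets."""
    calls = set()
    for snippet in CODE_RE.findall(html):
        calls.update(name for name in CALL_RE.findall(snippet)
                     if name not in KEYWORDS)
    return calls


def rerank(docs):
    """Return document indices, best first, scored by shared call names."""
    call_sets = [extract_calls(doc) for doc in docs]
    # Document frequency of each call across the retrieved results.
    support = Counter(call for calls in call_sets for call in calls)

    def score(calls):
        if not calls:
            return 0.0  # documents without code rank last
        return sum(support[c] for c in calls) / (len(calls) * len(docs))

    scores = [score(calls) for calls in call_sets]
    # sorted() is stable, so ties keep the original search-engine order.
    return sorted(range(len(docs)), key=lambda i: -scores[i])


def ndcg_at_k(relevances, k=5):
    """NDCG@k with linear gain; relevances are listed in ranked order."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

Given a list of retrieved documents as HTML strings, `rerank(docs)` returns the new ordering, and `ndcg_at_k` can then be applied to the human relevance judgments listed in that order to compare it with the original ranking.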