Answer Extraction Algorithm Based on Syntax Structure Feature Parsing and Classification
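  • Journal: Hu Baoshun, Wang Daling, Yu Ge, Ma Ting, Answer Extraction Algorithm Based on Syntax Structure Feature Parsing and Classification, Chinese Journal of Computers (计算机学报), 31(4), 6
  • Date: 0
  • Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] Department of Computer Science and Technology, Software College, Northeastern University, Shenyang 110004; [2] Institute of Computer Software and Theory, School of Information Science and Engineering, Northeastern University, Shenyang 110004
  • Funding: Supported by the National Natural Science Foundation of China (60573090).
  • Related project: Research on a User Motivation Inference Model for Next-Generation Search Engines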
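Abstract (translated from the Chinese):

The characteristics and difficulties of Chinese natural language processing, together with the relative scarcity of basic language-processing resources, mean that some mature techniques and research results from abroad cannot be applied directly to Chinese question answering systems. For Chinese factoid question answering systems, this paper therefore proposes a new answer extraction algorithm based on syntactic structure feature analysis and classification. The method treats answer extraction as a classification problem over candidate answers: each candidate answer is classified as either correct or incorrect. First, using the candidate answer type information that corresponds to the question type, the method extracts candidate answers from text snippets, along with their simple features and syntactic structure features in the sentence. It then trains a classifier on these features. Finally, the trained classifier judges whether each candidate answer is a correct answer. On Chinese factoid questions, compared with a currently typical Chinese answer extraction algorithm based on pattern matching, the method improves accuracy by 6.2% and MRR by 9.7%.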
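The three steps in the abstract (extract candidates with features, train a classifier, judge each candidate) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the classifier or list the exact features, so the logistic-regression model, the three numeric features, the `Candidate` class, and all toy values below are hypothetical stand-ins, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    features: list   # hypothetical numeric features, not the paper's actual set
    label: int = 0   # 1 = correct answer, 0 = incorrect (used only in training)


def predict(model, features):
    """Probability that a candidate with these features is a correct answer."""
    weights, bias = model
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(candidates, epochs=200, lr=0.5):
    """Fit a logistic-regression classifier with plain stochastic gradient steps."""
    model = ([0.0] * len(candidates[0].features), 0.0)
    for _ in range(epochs):
        for c in candidates:
            weights, bias = model
            err = c.label - predict(model, c.features)
            weights = [w + lr * err * x for w, x in zip(weights, c.features)]
            model = (weights, bias + lr * err)
    return model


def extract_answer(model, candidates):
    """Return the candidate the classifier considers most likely to be correct."""
    return max(candidates, key=lambda c: predict(model, c.features))


# Toy training data. Feature order (all assumed for illustration):
# [keyword overlap with the question,
#  closeness to a question keyword in the token sequence,
#  closeness to a question keyword in the parse tree]
training = [
    Candidate("train_pos_1", [0.8, 0.9, 1.0], 1),
    Candidate("train_neg_1", [0.2, 0.3, 0.25], 0),
    Candidate("train_pos_2", [0.7, 0.8, 0.5], 1),
    Candidate("train_neg_2", [0.1, 0.2, 0.2], 0),
]
model = train(training)

# Unlabeled candidates extracted for a new question.
new_candidates = [
    Candidate("candidate_A", [0.1, 0.1, 0.2]),
    Candidate("candidate_B", [0.9, 0.9, 1.0]),
]
best = extract_answer(model, new_candidates)
print(best.text)
```

Scoring every candidate and keeping the highest-scoring one is one simple way to turn a binary correct/incorrect classifier into an answer selector; sorting candidates by the same score would give a ranked list instead.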

English abstract:

Owing to the characteristics and difficulties of Chinese natural language processing and the lack of related resources, some mature techniques developed abroad cannot be applied directly to Chinese Question Answering (QA) systems. For Chinese factoid QA systems, this paper presents a new answer extraction method based on syntax structure feature parsing and classification. The method treats answer extraction as a candidate answer classification problem, i.e., candidate answers are classified as correct or incorrect. First, according to the part-of-speech information of candidate answers corresponding to question types, the candidate answers and their features (both simple and syntactic) are extracted from the sentences in snippets. These features are then used to train a classifier. Finally, the trained classifier is used to decide whether each candidate answer is correct. For Chinese factoid questions, compared with a currently typical answer extraction algorithm based on pattern matching, the new method improves precision by 6.2% and MRR by 9.7%.
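For readers unfamiliar with the two reported metrics, the sketch below shows how MRR (mean reciprocal rank) and a top-1 precision are commonly computed for a QA system. The abstract does not define its precision measure, so treating it as the fraction of questions whose top-ranked answer is correct is an assumption; the function names and the toy answer lists are hypothetical and are not data from the paper.

```python
def reciprocal_rank(ranked_answers, gold_answers):
    """1/rank of the first correct answer, or 0.0 if no correct answer is returned."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in gold_answers:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(results):
    """Average the reciprocal rank over all questions."""
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in results) / len(results)


def top1_precision(results):
    """Fraction of questions whose top-ranked answer is correct (assumed definition)."""
    hits = sum(1 for ranked, gold in results if ranked and ranked[0] in gold)
    return hits / len(results)


# One (ranked system answers, set of gold answers) pair per question; toy values.
results = [
    (["a1", "a2", "a3"], {"a1"}),  # correct answer at rank 1: reciprocal rank 1
    (["b1", "b2", "b3"], {"b2"}),  # correct answer at rank 2: reciprocal rank 1/2
    (["c1", "c2"], {"c9"}),        # no correct answer returned: reciprocal rank 0
]
mrr = mean_reciprocal_rank(results)
p1 = top1_precision(results)
print(mrr, p1)
```

MRR rewards a system for placing a correct answer near the top even when the first answer is wrong, which is why it can improve by a different amount than top-1 precision.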
