

Research on Rule-Based Question Answering for Chinese Reading Comprehension

ISSN: 1003-0077
Journal: Journal of Chinese Information Processing (《中文信息学报》)
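Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Computing Center, Shanxi University, Taiyuan, Shanxi 030006; [2] School of Mathematical Sciences, Shanxi University, Taiyuan, Shanxi 030006; [3] School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006
Funding: National Natural Science Foundation of China (60873128); National Social Science Foundation of China, Youth Project (07CYY022)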

Authors: Li Jihong (李济洪) [1], Yang Xingli (杨杏丽) [2], Wang Ruibo (王瑞波) [3], Zhang Na (张娜) [2], Li Guochen (李国臣) [3]

Keywords: computer application, Chinese information processing, reading comprehension, question answering, heuristic rules, orthogonal array

Abstract (translated from the Chinese):

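This paper addresses six types of questions in Chinese reading comprehension question answering (time, person, location, number, entity, and description) and formulates a set of heuristic rules for answering each type. Each rule in a rule set is assigned a weight, and the weights are tuned using an orthogonal array. A method is then given for scoring each candidate answer sentence based on the corresponding rules. The method was evaluated on CRCC v1.1, the Chinese reading comprehension corpus developed at Shanxi University, and achieved a HumSent accuracy of 83.09% on the whole corpus. For comparison with the maximum entropy method of reference [10], the rule weights were tuned on exactly the same training set as in [10] and evaluated on the same test set, yielding a HumSent accuracy of 81.13%, about 1% higher than the maximum entropy method. On all six question types, the HumSent accuracy of the proposed method is no lower than that of the maximum entropy method.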
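The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea of weighted-rule sentence scoring: each rule that fires on a candidate sentence adds its weight, and the highest-scoring sentence is returned as the answer. The two rules (`word_overlap`, `has_digit`), their weights, and the whitespace tokenization of English text are assumptions made for illustration, not the paper's rule set; real Chinese input would first need word segmentation.

```python
from typing import Callable, List, Tuple

# A rule inspects (question, sentence) and returns True when it fires.
Rule = Callable[[str, str], bool]


def word_overlap(question: str, sentence: str) -> bool:
    """Hypothetical rule: the sentence shares at least one token with the question."""
    return bool(set(question.lower().split()) & set(sentence.lower().split()))


def has_digit(question: str, sentence: str) -> bool:
    """Hypothetical rule for time/number questions: the sentence contains a digit."""
    return any(ch.isdigit() for ch in sentence)


def score(question: str, sentence: str,
          weighted_rules: List[Tuple[Rule, float]]) -> float:
    """Sum the weights of all rules that fire on this candidate sentence."""
    return sum(weight for rule, weight in weighted_rules if rule(question, sentence))


def best_sentence(question: str, sentences: List[str],
                  weighted_rules: List[Tuple[Rule, float]]) -> str:
    """Return the candidate sentence with the highest rule-based score."""
    return max(sentences, key=lambda s: score(question, s, weighted_rules))


# Illustrative weights only; in the paper the weights are tuned, not hand-picked.
rules = [(word_overlap, 1.0), (has_digit, 2.0)]
question = "when was the bridge built"
sentences = [
    "the bridge crosses the river",
    "the bridge was built in 1931",
    "people like it",
]
answer = best_sentence(question, sentences, rules)
```

Keeping the rules as plain functions paired with weights means the weights can be changed without touching the rules, which is what a separate weight-tuning step needs.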

English abstract:

This paper constructs a set of heuristic rules for six types of questions (time, person, location, number, entity, and description) in a Chinese QARC system. Each rule is assigned a weight, and the weights are optimized using an orthogonal array. The score of each candidate answer sentence is then computed from the corresponding rules. An experiment on CRCC v1.1 (Chinese Reading Comprehension Corpus), built by Shanxi University, yields 83.09% HumSent accuracy. Compared with the ME-based method, the proposed approach achieves 81.13% HumSent accuracy, about 1% higher than the ME-based result on the same training and test sets.
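The abstract says the rule weights are tuned with an orthogonal array but does not say which array or which analysis was used. As a hedged illustration of the general technique only, the sketch below uses the standard L9(3^4) array: four weights with three candidate levels each are tried in 9 runs instead of all 81 combinations, and for each weight the level with the best mean result is kept (a main-effects analysis). The function `toy_accuracy`, the `TARGET` values, and the candidate levels are hypothetical stand-ins for measuring HumSent accuracy on a training set.

```python
from typing import Callable, List, Sequence

# Standard L9(3^4) orthogonal array: 9 runs, 4 factors, 3 levels (0, 1, 2).
# Every pair of columns contains each of the 9 level pairs exactly once.
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]


def tune_weights(levels: Sequence[float],
                 evaluate: Callable[[List[float]], float]) -> List[float]:
    """Pick one of three candidate levels for each of four weights using L9."""
    if len(levels) != 3:
        raise ValueError("L9 requires exactly three candidate levels")
    # Run only the 9 weight settings prescribed by the array.
    results = [evaluate([levels[i] for i in row]) for row in L9]
    chosen = []
    for factor in range(4):
        # Mean result of the runs in which this factor was at each level.
        means = []
        for level_idx in range(3):
            vals = [r for row, r in zip(L9, results) if row[factor] == level_idx]
            means.append(sum(vals) / len(vals))
        chosen.append(levels[means.index(max(means))])
    return chosen


# Hypothetical objective standing in for accuracy on a training set:
# it is highest when the weights equal TARGET.
TARGET = (1.0, 2.0, 0.5, 1.0)


def toy_accuracy(weights: List[float]) -> float:
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


tuned = tune_weights((0.5, 1.0, 2.0), toy_accuracy)
```

The per-factor analysis is exact here because the toy objective is additive across weights; when the rules interact, the selected levels are only an approximation and are usually confirmed with an extra evaluation run.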
