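Drawing on semantic, syntactic, and other linguistic knowledge, the rules of the head-driven parsing model are decomposed and modified, and parsing is combined with word segmentation and part-of-speech tagging, yielding a model that can take several semantic dependency relations into account simultaneously. Two definitions of word similarity, one based on adjacency relations and one based on semantic dependency relations, are given in terms of mutual information, and a bottom-up hierarchical clustering algorithm is proposed to address the data sparseness problem of the head-driven model. Parsing experiments are carried out with the improved model. The results show that the model achieves a precision of 88.14% and a recall of 86.93%, and that its comprehensive index is 6.09% higher than that of the Collins head-driven parsing model.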
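The abstract does not spell out the similarity definitions or the clustering procedure, so the following is only a minimal sketch of the general idea under stated assumptions: each word is represented by positive pointwise mutual information (PPMI) scores over its adjacent words, similarity is the cosine between those vectors, and clusters are merged bottom-up with average linkage until no pair exceeds a threshold. The function names, the toy data, and the threshold are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def ppmi_vectors(pairs):
    """Map each word to a sparse vector of positive PMI scores over its contexts.

    `pairs` is a list of (word, context) observations, e.g. a word and the
    word adjacent to it. This representation is an assumption of the sketch.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    vectors = {w: {} for w in word_counts}
    for (w, c), n in pair_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) )
        pmi = math.log(n * total / (word_counts[w] * ctx_counts[c]))
        if pmi > 0:
            vectors[w][c] = pmi
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def cluster_similarity(a, b, vectors):
    """Average-linkage similarity between two clusters of words."""
    total = sum(cosine(vectors[x], vectors[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def bottom_up_cluster(vectors, threshold):
    """Merge the most similar pair of clusters until none reaches `threshold`."""
    clusters = [frozenset([w]) for w in sorted(vectors)]
    while len(clusters) > 1:
        score, (a, b) = max(
            ((cluster_similarity(a, b, vectors), (a, b))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[0],
        )
        if score < threshold:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Toy (word, following word) observations, purely illustrative.
pairs = [
    ("cat", "eats"), ("cat", "sleeps"),
    ("dog", "eats"), ("dog", "sleeps"),
    ("car", "drives"), ("bus", "drives"),
]
clusters = bottom_up_cluster(ppmi_vectors(pairs), threshold=0.5)
print(sorted(sorted(c) for c in clusters))
```

On this toy input, words that share contexts end up in the same cluster, so counts for a rare head word can be pooled with those of its cluster, which is how clustering is usually used to ease data sparseness in lexicalized models. A similarity based on semantic dependency relations could reuse the same code by supplying (word, dependency-linked word) pairs instead of adjacent pairs.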
A novel statistical parsing model is proposed that incorporates linguistic features such as semantic dependency and syntactic relations. The model is built on word clusters, which alleviates the data sparseness problem. It is a lexicalized parser that exploits several semantic dependencies at the same time. Experiments were conducted with the refined statistical parser. The results show that precision and recall are 88.14% and 86.93%, respectively, and that the comprehensive index is improved by 6.09% compared with that of the head-driven parsing model.
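The abstract reports precision and recall together with a combined score but does not define the latter. Assuming, as is common in parsing evaluation, that it is the balanced F-measure (the harmonic mean of precision and recall), the value implied by the reported figures can be checked with a short sketch. The function name and the `beta` parameter are illustrative choices, not the paper's notation.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f_measure(88.14, 86.93)
print(f"{f1:.2f}")  # prints 87.53
```

Under this assumption the combined score is about 87.53%. The abstract does not say whether the 6.09% improvement is an absolute difference in percentage points or a relative gain, so no baseline value is derived here.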