To address the shortcomings of existing methods for identifying product features in Chinese customer reviews, this paper proposes a template-based product feature identification method that draws on lexical analysis, syntactic analysis, and the Tongyici Cilin synonym thesaurus to mine the linguistic knowledge latent in real corpora. The method first applies lexical analysis, syntactic analysis, and manual annotation to a review corpus. From the annotated results it then analyzes and generalizes the global linguistic rules of review sentences and extracts the part-of-speech (POS) and dependency-relation sequences between feature words and opinion words. Next, it constructs product feature templates with the help of Tongyici Cilin, and finally uses these templates to identify the product features mentioned in reviews. Comparative experiments demonstrate the effectiveness of the proposed method.
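As a rough illustration of the pipeline described above, the following Python sketch builds a template from one annotated sentence and applies it to a new one. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the template format, so the `Token` structure, the `make_template` and `extract_features` helpers, the English example words, the tag names, and the tiny `SYNONYM_GROUP` table standing in for Tongyici Cilin are all hypothetical. The input is assumed to be already segmented, POS-tagged, and dependency-parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    word: str
    pos: str   # part-of-speech tag (illustrative tag set)
    rel: str   # dependency relation to the head (illustrative labels)


# Hypothetical stand-in for Tongyici Cilin: word -> synonym-group id.
SYNONYM_GROUP = {"great": "G1", "excellent": "G1", "good": "G1"}


def group_of(word):
    """Map a word to its synonym group, or to itself if it has none."""
    return SYNONYM_GROUP.get(word, word)


def make_template(tokens, feat_idx, op_idx):
    """Assumed template: POS and dependency sequences over the span between
    the feature word and the opinion word, their relative order, and the
    synonym group of the opinion word."""
    lo, hi = sorted((feat_idx, op_idx))
    span = tokens[lo:hi + 1]
    return (
        tuple(t.pos for t in span),
        tuple(t.rel for t in span),
        feat_idx < op_idx,
        group_of(tokens[op_idx].word),
    )


def extract_features(tokens, templates):
    """Return every word that fills the feature slot of a known template."""
    found = []
    for i in range(len(tokens)):
        for j in range(len(tokens)):
            if i != j and make_template(tokens, i, j) in templates:
                found.append(tokens[i].word)
    return found


# Manually annotated sentence: feature = "screen" (1), opinion = "great" (3).
annotated = [
    Token("the", "DT", "det"),
    Token("screen", "NN", "nsubj"),
    Token("is", "VBZ", "cop"),
    Token("great", "JJ", "root"),
]
templates = {make_template(annotated, 1, 3)}

# New review sentence with the same structure and a synonymous opinion word.
review = [
    Token("the", "DT", "det"),
    Token("battery", "NN", "nsubj"),
    Token("is", "VBZ", "cop"),
    Token("excellent", "JJ", "root"),
]
print(extract_features(review, templates))  # ['battery']
```

Storing the opinion word as a synonym group rather than a literal word is what lets a template learned from "great" also match "excellent"; a real system would presumably draw those groups from the thesaurus rather than from a hand-written table.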