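To discover online comments that target the entities involved in news events, this paper proposes a conditional-random-field-based method for matching named entities in online comments against named entities in news events. Semi-Markov CRFs are used to recognize named entities at segment granularity in comment sentences. Because comments are written informally, the method combines pattern features, symbol features, and other features of named entities to recognize variant forms in comments, such as short names, abbreviations, and nicknames. Linear-chain CRFs are then combined with several matching methods to compute a comprehensive similarity between the named entities in comments and those in events, which completes the matching. Experiments show that the proposed method can accurately match comments to events by their named entities.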
To detect online reviews that discuss given named entities of news events, this paper proposes a method based on conditional random fields for matching online reviews with the named entities of events. It uses semi-Markov CRFs to label reviews and recognize named entities at the segment level. Because reviews are written in a free, informal style, the method uses pattern features, symbol features, and other features of named entities to recognize variant forms such as abbreviations and acronyms. It then uses linear-chain CRFs and a comprehensive similarity measure to compute the similarity between named entities in reviews and named entities in events. The experimental results demonstrate that the proposed method achieves higher matching accuracy than other methods.
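As a rough illustration of what a comprehensive similarity between a review mention and an event entity might look like, the following minimal sketch combines three simple matchers: character-level similarity, an acronym check, and an ordered-subsequence check for abbreviations. It is not the paper's implementation. The paper combines its matching methods with linear-chain CRFs, whereas this sketch swaps in a fixed weighted sum. The function names, the weights, and the threshold are all hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_acronym(short: str, full: str) -> bool:
    """True if `short` equals the initials of the words in `full`."""
    initials = "".join(word[0] for word in full.split())
    return len(short) > 1 and short.lower() == initials.lower()


def is_subsequence(short: str, full: str) -> bool:
    """True if the characters of `short` appear in order within `full`."""
    remaining = iter(full.lower())
    # `ch in remaining` consumes the iterator, which enforces the ordering.
    return all(ch in remaining for ch in short.lower())


def combined_similarity(mention: str, entity: str,
                        weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted sum of the three matchers (weights are assumed, not learned)."""
    w_edit, w_acronym, w_subseq = weights
    return (w_edit * edit_similarity(mention, entity)
            + w_acronym * float(is_acronym(mention, entity))
            + w_subseq * float(is_subsequence(mention, entity)))


def best_match(mention: str, entities: list, threshold: float = 0.4):
    """Return the best-scoring event entity, or None if below the threshold."""
    if not entities:
        return None
    score, entity = max((combined_similarity(mention, e), e) for e in entities)
    return entity if score >= threshold else None
```

For example, `best_match("WHO", ["World Health Organization", "United Nations"])` selects the first entity, because the acronym and subsequence checks both fire for it and neither fires for the second. In a learned setting, the individual matcher outputs would serve as features rather than being summed with fixed weights.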