This paper proposes a conditional random field (CRF) based method for Vietnamese named entity recognition that is tailored to the characteristics of the Vietnamese language. The method takes words and their part-of-speech tags as features and defines feature templates over them. A corpus of Vietnamese news text is annotated with six entity classes, including place names, person names, and organizations, and is used to train a CRF entity recognition model. Experimental results show that the method achieves an entity extraction accuracy of 83.73%.
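To make the idea of a word and part-of-speech feature template concrete, the following is a minimal sketch and not the authors' actual template. The window size of two tokens, the feature names (`w[k]`, `p[k]`), the boundary markers, the POS tags, and the example sentence are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def token_features(sent: List[Token], i: int, window: int = 2) -> Dict[str, str]:
    """Build word and POS features for position i from a +/-window context.

    The window size and feature names are hypothetical choices.
    """
    feats: Dict[str, str] = {"bias": "1"}
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0:
            word, pos = "<BOS>", "<BOS>"  # padding before the sentence start
        elif j >= len(sent):
            word, pos = "<EOS>", "<EOS>"  # padding after the sentence end
        else:
            word, pos = sent[j]
        feats[f"w[{offset}]"] = word
        feats[f"p[{offset}]"] = pos
    # Combined feature: the current word together with its POS tag.
    feats["w[0]|p[0]"] = f"{sent[i][0]}|{sent[i][1]}"
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """Apply the template to every token of a sentence."""
    return [token_features(sent, i) for i in range(len(sent))]


# Hypothetical word-segmented, POS-tagged sentence (tags are illustrative).
sent: List[Token] = [
    ("Ông", "N"),
    ("Nguyễn_Văn_A", "Np"),
    ("đến", "V"),
    ("Hà_Nội", "Np"),
]
# Hypothetical BIO labels that a trained model would be expected to predict.
labels = ["O", "B-PER", "O", "B-LOC"]

features = sentence_features(sent)
```

Each sentence becomes a list of per-token feature dictionaries paired with a label sequence, which is the general shape of training data a linear-chain CRF learner consumes. Multi-syllable Vietnamese words are joined with underscores here on the assumption that word segmentation and POS tagging happen before feature extraction.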