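The direct goal of drug name recognition is to find drug names in biomedical text. Drug-related research is now being published far faster than curators can update drug information databases, which creates an urgent need for techniques that extract drug information automatically. This paper uses a semi-supervised learning method based on Feature Coupling Generalization (FCG) to generate a drug name dictionary, and then combines the dictionary with conditional random fields (CRF) for drug name entity recognition. We first construct a drug name dictionary with a template-based method, then denoise the dictionary with FCG, and finally apply the denoised dictionary to drug name entity recognition on the test set, obtaining an F-score of 76.73%.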
Drug name recognition aims to find drug names in biomedical texts, a much-needed technology given the rapidly growing volume of drug research. We adopt a semi-supervised learning method to build a dictionary and then combine the dictionary with the Conditional Random Field (CRF) method to recognize drug name entities. First, we extract a drug name dictionary using template matching, and then use Feature Coupling Generalization (FCG) to filter the dictionary. Finally, we combine the filtered dictionary with the CRF to recognize drug entities. Our method achieved an F-score of 0.7673 on the drug name recognition corpus.
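As a rough illustration of how a template-derived dictionary can feed a sequence labeler, the following minimal sketch extracts candidate names with a single pattern and turns dictionary matches into per-token B/I/O features of the kind a CRF could consume. The pattern "drugs such as ...", the function names, and the feature names are all hypothetical, since the abstract does not specify the templates or the feature set. The FCG denoising step and CRF training are not shown.

```python
import re

# Hypothetical template: the abstract does not list the actual templates used.
TEMPLATE = re.compile(r"\bdrugs? such as ([^.;]+)", re.IGNORECASE)


def extract_candidates(sentences):
    """Collect lower-cased candidate drug names matched by the template."""
    candidates = set()
    for sentence in sentences:
        for match in TEMPLATE.finditer(sentence):
            # Split the enumerated list on commas and on "and" / "or".
            for name in re.split(r",|\band\b|\bor\b", match.group(1)):
                name = name.strip().lower()
                if name:
                    candidates.add(name)
    return candidates


def dictionary_tags(tokens, dictionary, max_len=3):
    """Greedy longest-match lookup; returns one B/I/O tag per token."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest window first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]).lower() in dictionary:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


def token_features(tokens, dictionary):
    """Build one feature dict per token, including the dictionary tag."""
    tags = dictionary_tags(tokens, dictionary)
    return [
        {"word": tok.lower(), "suffix3": tok[-3:].lower(), "in_dict": tag}
        for tok, tag in zip(tokens, tags)
    ]


corpus = ["Patients received drugs such as aspirin, warfarin and valproic acid."]
drug_dict = extract_candidates(corpus)
features = token_features("Warfarin interacts with valproic acid".split(), drug_dict)
```

In a full pipeline, a filtering step such as FCG would remove noisy entries from `drug_dict` before the lookup, and the feature dictionaries would be passed to a CRF trainer together with gold labels.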