由于中文和英文在语法和句法等方面的差异,面向中文文本的本体学习方法尚存在一定困难。研究了面向中文文本的玉米病虫害本体学习方法。提出单字合并法,将其与TFIDF方法结合,进行概念抽取;将欧几里德距离与余弦距离加权平均计算概念相似度,进行概念关系抽取。从中国玉米网选取50篇领域文档,应用上述方法构建了玉米病虫害本体。
Because Chinese and English differ in grammar and syntax, ontology learning from Chinese text still faces difficulties. This paper studies a method for learning an ontology of maize pests and diseases from Chinese text. A single-character merging method is proposed and combined with TF-IDF to extract concepts. To extract concept relations, concept similarity is computed as a weighted average of the Euclidean distance and the cosine distance. Using these methods, an ontology of maize pests and diseases was built from 50 domain documents selected from the China Maize Network (中国玉米网).
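The weighted-average measure used for concept relation extraction can be illustrated with a minimal Python sketch. The abstract does not give the weights, the vector representation, or any normalization, so everything below is an assumption made for illustration: concepts are represented as equal-length numeric feature vectors (for example TF-IDF weights), cosine distance is taken as one minus cosine similarity, and the hypothetical parameter `alpha` is the weight placed on the Euclidean term.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus cosine similarity; 1.0 if either vector is all zeros.

    The zero-vector convention is a choice made for this sketch.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def combined_distance(u, v, alpha=0.5):
    """Weighted average of Euclidean and cosine distance.

    alpha is a hypothetical weight in [0, 1]; the paper's actual
    weighting is not stated in the abstract.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * euclidean_distance(u, v) + (1.0 - alpha) * cosine_distance(u, v)
```

Under these assumptions a smaller combined distance means two concepts are more similar. Because the Euclidean term is unbounded while the cosine term lies in [0, 2] (and in [0, 1] for non-negative vectors such as TF-IDF weights), a practical use would probably normalize the vectors or rescale the Euclidean term before averaging.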