Wikipedia, a large-scale knowledge base, is increasingly being applied in a variety of fields. In ontology construction, its rich organizational structure provides an effective environment for large-scale, collaborative ontology learning, and using Wikipedia as a source for ontology learning is becoming a new research focus. From this perspective, the paper first describes the basic structure of Wikipedia, then analyzes and compares the principles and methods for acquiring ontology concepts and instances from the category structure graph, infoboxes, and definition sentences. It then explains the principles of acquiring ontology relations from Wikipedia, analyzes pattern-matching methods based on structural features, dictionaries, syntax, and hybrid approaches, as well as statistical learning methods based on structural and textual features, and compares the effectiveness of these methods for relation acquisition.