This paper studies an ontology concept attachment method based on a multi-feature representation. We take the knowledge system of the Encyclopedia of China as the taxonomic backbone of the ontology, extract articles from Web knowledge bases as ontology concepts, and derive the hierarchical relations between concepts by analyzing each article's text content, semantic tags (folksonomies), and semi-structured information. In this way we extend the Encyclopedia of China knowledge system into a multi-domain Chinese ontology with concepts numbering in the millions, which lays a solid foundation for further extracting concept attributes and non-taxonomic relations between concepts, and for supporting applications such as question answering. Experiments show that the proposed method improves attachment precision by 11.8% over a single-feature method.
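As a rough illustration of the multi-feature idea, the sketch below scores candidate taxonomy nodes for one knowledge base article by combining three signals: text content, tags, and semi-structured (infobox-style) keys. It is only a minimal sketch under stated assumptions. The Jaccard overlap measure, the linear weighting, the weight values, and every name in it (`Article`, `TaxonomyNode`, `score`, `attach`) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """A Web knowledge base article treated as a candidate concept (hypothetical)."""
    title: str
    text_terms: frozenset      # terms from the article body
    tags: frozenset            # folksonomy / category tags
    infobox_keys: frozenset    # keys from semi-structured data


@dataclass(frozen=True)
class TaxonomyNode:
    """A node of the backbone taxonomy, described by the same three features."""
    name: str
    text_terms: frozenset
    tags: frozenset
    infobox_keys: frozenset


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Assumed weights for illustration only; the paper does not specify them here.
DEFAULT_WEIGHTS = {"text": 0.4, "tags": 0.4, "infobox": 0.2}


def score(article, node, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-feature similarities between an article and a node."""
    return (
        weights["text"] * jaccard(article.text_terms, node.text_terms)
        + weights["tags"] * jaccard(article.tags, node.tags)
        + weights["infobox"] * jaccard(article.infobox_keys, node.infobox_keys)
    )


def attach(article, nodes, weights=DEFAULT_WEIGHTS):
    """Return the taxonomy node with the highest combined score."""
    return max(nodes, key=lambda node: score(article, node, weights))
```

Setting two of the three weights to zero turns the same function into a single-feature baseline, which is the kind of comparison the abstract reports.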