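To address the shortcomings of existing domain ontology construction algorithms, a graph-based algorithm for the automatic construction of domain ontologies, oriented to knowledge and information management, was proposed, comprising concept extraction and relationship extraction. Domain text documents were mapped to document concept graphs, a term weighting algorithm based on random walks on the graph was used to measure the importance of terms from both global and local perspectives, and a graph vertex clustering algorithm was used to group terms into candidate concepts. An algorithm for extracting arbitrary relationships between concepts, based on constrained mining of frequent informative subgraphs, was proposed, and an information function was introduced to evaluate the information content of each subgraph. The resulting domain concepts and inter-concept relationships were assessed through ontology evaluation and then described in OWL-DL as the domain ontology. Experiments verified the effectiveness of the algorithm.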
To address the shortcomings of existing domain ontology construction algorithms, a graph-based approach to the automatic construction of domain ontologies for knowledge and information management was proposed, comprising concept extraction and relationship extraction. Each document in the collection was mapped to a document graph. Random-walk term weighting was employed to estimate the importance of each term to the corpus from both local and global perspectives. A graph vertex clustering algorithm was used to separate terms with different meanings and to group similar terms into candidate concepts. An improved frequent subgraph mining algorithm, constrained by both vertices and information, was proposed to find arbitrary latent relationships among these concepts. For ontology evaluation, a method was put forward for adaptively adjusting concepts and relationships according to their practical effectiveness. Finally, the domain ontology was formed by describing the domain concepts and relationships in OWL-DL. Evaluation experiments showed the effectiveness of the algorithm.
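The random-walk term weighting step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the co-occurrence window used to build each document graph, the PageRank-style update with a damping factor of 0.85 as the local score, the logarithmic inverse document frequency as the global factor, and the function names `build_cooccurrence_graph`, `random_walk_scores`, and `weight_terms`.

```python
import math
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Undirected term graph: an edge links terms at most `window` positions apart.

    The window-based construction is an assumption, not the paper's definition.
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        graph[term]  # ensure every term gets a vertex, even without neighbours
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_scores(graph, damping=0.85, iterations=50):
    """Local importance via a PageRank-style power iteration (assumed variant)."""
    n = len(graph)
    scores = {t: 1.0 / n for t in graph}
    for _ in range(iterations):
        updated = {}
        for t in graph:
            # Each neighbour u passes an equal share of its score to t.
            incoming = sum(scores[u] / len(graph[u]) for u in graph[t])
            updated[t] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


def weight_terms(documents, window=2):
    """Combine the local random-walk score with a global idf-like factor."""
    doc_freq = defaultdict(int)
    for tokens in documents:
        for term in set(tokens):
            doc_freq[term] += 1
    total = len(documents)
    weights = []
    for tokens in documents:
        local = random_walk_scores(build_cooccurrence_graph(tokens, window))
        weights.append(
            {t: s * math.log(1 + total / doc_freq[t]) for t, s in local.items()}
        )
    return weights
```

Calling `weight_terms` on a list of tokenized documents returns one dictionary per document that maps each term to its combined weight. In such a pipeline, these weights could then feed the vertex clustering step that produces candidate concepts.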