This paper proposes a method for learning Chinese domain ontologies based on P-sets and formal concept analysis. The method takes unstructured Chinese text as its data source and uses P-set theory to obtain a formal context. From this formal context it constructs a concept lattice with the Godin algorithm, then maps the lattice to a Chinese domain ontology in OWL using mapping rules defined by the authors. Applying the method to texts from the biology and water domains yields a Chinese domain ontology. The experimental results show that the method can address the time-consuming and labor-intensive nature of building ontologies by hand, and that the learned ontology is a formal ontology, which makes it easier to share and reuse.
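To make the step from a formal context (the object-attribute table of formal concept analysis) to a concept lattice concrete, here is a minimal sketch in Python. It is not the Godin algorithm named in the abstract, which builds the lattice incrementally; it enumerates concepts naively by closing every subset of objects, which is only practical for very small contexts. The toy context, the attribute names, and the functions `formal_concepts` and `hasse_edges` are all hypothetical illustrations and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy formal context (not data from the paper):
# each object (a term) maps to the set of attributes it has.
context = {
    "fish": {"lives_in_water", "animal"},
    "frog": {"lives_in_water", "lives_on_land", "animal"},
    "dog": {"lives_on_land", "animal"},
    "reed": {"lives_in_water", "plant"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Objects that possess every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Naive enumeration: close every subset of objects into (extent, intent).

    Exponential in the number of objects; a stand-in for an incremental
    lattice construction such as Godin's, not an implementation of it.
    """
    all_attributes = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def hasse_edges(concepts):
    """Direct (lower, upper) subconcept pairs, i.e. the lattice's cover relation."""
    edges = set()
    for lower in concepts:
        for upper in concepts:
            if lower[0] < upper[0] and not any(
                lower[0] < mid[0] < upper[0] for mid in concepts
            ):
                edges.add((lower, upper))
    return edges


if __name__ == "__main__":
    concepts = formal_concepts(context)
    for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(extent), sorted(intent))
```

Each (extent, intent) pair is one formal concept, and each edge returned by `hasse_edges` is a direct subconcept link. One plausible mapping, assumed here rather than taken from the paper's rules, would turn concepts into ontology classes and these edges into subclass relations.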