Drawing on term co-occurrence theory, this paper uses the automatic generation of concept lattices in Formal Concept Analysis (FCA) to infer and visualize the hierarchical structure of domain terms, which are treated as attributes. On this basis it proposes a complete solution for constructing the concept hierarchy of a domain ontology, comprising: identification of semantic associations between documents/words and terms; construction of the domain formal context; FCA-based generation of topic concepts; extraction of hierarchical term relations from the topic concept lattice; and OWL description and graphical display of the resulting term hierarchy. Taking the "leukemia" domain as an example, the author demonstrates in detail how Chinese text is transformed into a hierarchy of medical terms without the support of a knowledge base, and compares the term hierarchies generated from two formal contexts, the Document-Term Matrix (DTM) and the Word-Term Matrix (WTM).
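
To make the core step concrete, the following is a minimal sketch of how a term hierarchy might be read off a formal context with FCA. It is not the author's implementation: the toy document-term context, the term names, and the helper functions `extent`, `intent`, `formal_concepts`, and `term_hierarchy` are all assumptions introduced for illustration, and the brute-force enumeration is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical toy formal context (an assumption, not data from the paper):
# objects are documents, attributes are the domain terms they contain.
context = {
    "doc1": {"leukemia", "acute leukemia", "AML"},
    "doc2": {"leukemia", "acute leukemia", "ALL"},
    "doc3": {"leukemia", "chronic leukemia", "CML"},
    "doc4": {"leukemia", "chronic leukemia"},
}
ALL_TERMS = frozenset().union(*context.values())


def extent(terms):
    """Documents that contain every term in `terms`."""
    return frozenset(d for d, ts in context.items() if set(terms) <= ts)


def intent(docs):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    docs = list(docs)
    if not docs:
        return ALL_TERMS
    return frozenset(set.intersection(*(context[d] for d in docs)))


def formal_concepts():
    """Enumerate all (extent, intent) pairs by brute force.

    Every concept intent is the set of terms shared by some subset of
    documents, so closing each subset yields the whole lattice.
    """
    concepts = set()
    docs = list(context)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            shared = intent(subset)
            concepts.add((extent(shared), shared))
    return concepts


def term_hierarchy():
    """Map each term to its direct broader terms.

    Term t is placed below term u when the documents containing t form a
    proper subset of the documents containing u; indirect (transitive)
    parents are then dropped.
    """
    ext = {t: extent({t}) for t in ALL_TERMS}
    broader = {t: {u for u in ALL_TERMS if ext[t] < ext[u]} for t in ALL_TERMS}
    return {
        t: {u for u in broader[t]
            if not any(u in broader[v] for v in broader[t])}
        for t in ALL_TERMS
    }


if __name__ == "__main__":
    for child, parents in sorted(term_hierarchy().items()):
        print(child, "->", sorted(parents))
```

In this sketch a term is treated as narrower than another when its extent is strictly contained in the other's, which is one simple way to turn the attribute order of a concept lattice into broader/narrower links. A word-term context could be substituted for `context` in the same way to compare the two resulting hierarchies.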