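To semantically integrate heterogeneous data resources and provide a unified intelligent access interface, Semantic Web technology is used to publish machine-understandable data resources and the relationships among them, thereby supporting functions such as intelligent search. TCMSearch, an intelligent search engine for traditional Chinese medicine (TCM), is presented. The core of the engine is an integrated semantic knowledge base that uses a domain ontology to represent the instances in the TCM domain and the relationships among them. First, the system applies machine-learning methods to semantically annotate plain text; for relational database data, it uses semantic mapping to unify the semantic information. Next, the system builds a semantic index over the integrated data resources. The index is represented in the ontology languages RDF/OWL and therefore supports powerful reasoning functions such as class-hierarchy reasoning and instance-relation reasoning. Finally, by using the semantic index and the reasoning it supports, the system provides intelligent search on top of the integrated knowledge base, including new functions such as correlated search, semantic graph browsing, and instance recommendation.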
To semantically integrate heterogeneous resources and provide a unified intelligent access interface, Semantic Web technology is used to publish and interlink machine-understandable resources so that intelligent search can be supported. TCMSearch, a deployed intelligent search engine for traditional Chinese medicine (TCM), is presented. The core of the system is an integrated knowledge base that uses a TCM domain ontology to represent the instances and relationships in TCM. Machine-learning techniques are used to generate semantic annotations for plain text, semantic mappings are used to unify the semantics of relational databases, and a semantic index is then constructed over these integrated resources. Representing the semantic index in RDF/OWL makes it possible to support powerful reasoning functions such as class-hierarchy inference and relation inference. By combining resource integration with reasoning, the knowledge base supports intelligent search paradigms beyond keyword search, such as correlated search, semantic graph navigation, and concept recommendation.
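As a rough illustration of the class-hierarchy reasoning and correlated search mentioned above, the following minimal sketch runs over a toy in-memory triple list. It is not the TCMSearch implementation: the triples, the class and instance names (`Herb`, `ginseng`, `formula_A`, and so on), and the helper functions `superclasses`, `instances_of`, and `correlated` are all hypothetical, and plain Python tuples stand in for a real RDF/OWL store and reasoner.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); names are illustrative only.
TRIPLES = [
    ("Herb", "subClassOf", "Medicine"),
    ("Formula", "subClassOf", "Medicine"),
    ("Medicine", "subClassOf", "TCMConcept"),
    ("Disease", "subClassOf", "TCMConcept"),
    ("ginseng", "type", "Herb"),
    ("formula_A", "type", "Formula"),
    ("fatigue", "type", "Disease"),
    ("ginseng", "treats", "fatigue"),
    ("formula_A", "contains", "ginseng"),
]


def superclasses(cls, triples):
    """Return every transitive superclass of cls via subClassOf edges."""
    parents = defaultdict(set)
    for s, p, o in triples:
        if p == "subClassOf":
            parents[s].add(o)
    seen, stack = set(), [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(cls, triples):
    """Instances whose asserted type is cls or any (transitive) subclass of cls."""
    result = set()
    for s, p, o in triples:
        if p == "type" and (o == cls or cls in superclasses(o, triples)):
            result.add(s)
    return result


def correlated(instance, triples):
    """Resources directly linked to instance by a non-schema relation.

    Each result is (direction, predicate, other), where direction is "out"
    when instance is the subject and "in" when it is the object.
    """
    schema = {"type", "subClassOf"}
    linked = set()
    for s, p, o in triples:
        if p in schema:
            continue
        if s == instance:
            linked.add(("out", p, o))
        elif o == instance:
            linked.add(("in", p, s))
    return linked


if __name__ == "__main__":
    print(sorted(instances_of("Medicine", TRIPLES)))
    print(sorted(correlated("ginseng", TRIPLES)))
```

In this sketch, a query for the class `Medicine` also returns instances typed only as `Herb` or `Formula`, which is the kind of result a plain keyword match on "Medicine" would miss. A production system would more plausibly delegate this to an RDF store with RDFS/OWL inference than to hand-written traversal.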