The realization of the Semantic Web requires widely available semantic annotations for both existing and new documents on the Web, so that their content can be recognized and understood by machines. Semantic annotation is clearly specified and easy to understand, can serve as a basis for a number of useful applications, and is applicable to many kinds of text, including web pages, regular (non-web) documents, and text fields in databases. Following the research history of semantic annotation, this paper reviews the current state of domestic and international research on semantic annotation of text documents and summarizes the techniques it uses. Building on existing classifications of semantic annotation methods, we classify and analyze the current methods. We also point out the shortcomings of recent semantic annotation methods and discuss development trends in semantic annotation of text documents.