To date, work on classification has focused mainly on text and Web documents, and little attention has been paid to classifying deep Web query interfaces. A deep Web source consists of a query interface and its query results, and many such sources are structured, exposing structured query interfaces and returning structured results. Given the large number of deep Web sources, classifying their query interfaces by domain is a critical step toward the classification, integration, and retrieval of heterogeneous deep Web sources. In this paper, we present an ontology-based method for classifying deep Web query interfaces, comprising a conceptual model of the category ontology and a deep Web Vector Space Model (VSM) derived from it, with a novel weight calculation. Experimental results show that the method classifies well, achieving an average precision of 91.6% and an average recall of 92.4%.