This paper first analyzes two main features of Web Services Description Language (WSDL) documents: structure features and reference features. Based on how each feature affects the description of a Web service's functional semantics, it then proposes a multi-vector representation model for Web services. Unlike general-purpose text representation models, this model explicitly represents the essential features of WSDL documents: the semantics of the structure features are reflected in the partition into multiple vector spaces, and the semantics of the reference features are mapped to the computation of term weights within each sub-vector. A method for computing the similarity between two Web services based on the multi-vector model is proposed, and a prototype Web service discovery system built on the model is implemented. Finally, a Web service discovery test collection with incomplete relevance judgments is constructed from 1576 real WSDL documents, and discovery based on the multi-vector model is evaluated on this collection. The experimental results show that discovery based on the multi-vector model achieves significantly better retrieval effectiveness than discovery based on a simple text vector space model, at the 95% confidence level.
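The general idea of a multi-vector representation can be illustrated with a minimal Python sketch. It is not the paper's formulation: the abstract does not give the structural parts, the weighting formula, or the way sub-vector similarities are combined. The field names in `FIELDS`, the `1 + ref_count` weight, and the weighted sum of per-field cosine similarities are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical structural parts of a WSDL document (assumed, not from the paper).
FIELDS = ("service", "operation", "message", "type")


def sub_vector(elements):
    """Build one term vector from (terms, ref_count) pairs.

    ref_count is an assumed stand-in for the reference feature: how often
    the element is referenced elsewhere in the same document.
    """
    vec = Counter()
    for terms, ref_count in elements:
        for term in terms:
            # Assumed weighting: referenced elements count more.
            vec[term] += 1 + ref_count
    return vec


def multi_vector(doc):
    """Map a document to one sub-vector per structural part."""
    return {field: sub_vector(doc.get(field, [])) for field in FIELDS}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(a, b, field_weights=None):
    """Assumed combination: weighted sum of per-field cosine similarities."""
    if field_weights is None:
        field_weights = {field: 1.0 / len(FIELDS) for field in FIELDS}
    return sum(field_weights[f] * cosine(a[f], b[f]) for f in FIELDS)


# Toy, hand-written documents: each field holds (terms, ref_count) pairs.
weather_a = multi_vector({
    "service": [(["weather", "forecast"], 0)],
    "operation": [(["get", "weather"], 1)],
    "message": [(["city", "name"], 2)],
    "type": [(["temperature"], 1)],
})
weather_b = multi_vector({
    "service": [(["weather", "report"], 0)],
    "operation": [(["get", "weather"], 0)],
    "message": [(["city"], 1)],
    "type": [(["temperature", "unit"], 0)],
})
print(similarity(weather_a, weather_b))
```

Keeping one vector per structural part means that a term match in, say, an operation name is never confused with the same term appearing in a type definition, which is the kind of distinction a single bag-of-words vector cannot make.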