In fuzzy extensible markup language (XML) data management, measuring the similarity between a fuzzy XML document and a fuzzy document type definition (DTD) is a key step in fuzzy XML data integration and fuzzy XML document clustering. To study this similarity, the fuzzy DTD tree is transformed by a set of rules that resolve the disjunctive and cardinality constraints on elements and attributes: disjunctive normal forms are converted into conjunctive normal forms, and the number of repetitions of each element or attribute is made definite. A tree edit distance algorithm is then used to compare the fuzzy XML document tree with the resulting set of transformed fuzzy DTD trees. Experiments verify the performance advantages of the proposed method.
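The two stages described above (expanding a DTD tree into a set of fully determined trees, then comparing a document tree against that set by tree edit distance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the rules or cost model used in this work: the tuple encoding of trees, the names `expand`, `forest_dist`, and `similarity`, the bound `max_rep` on unbounded repetitions, unit edit costs, and the normalization of distance into a similarity score. Fuzzy membership degrees are omitted, so labels are compared crisply.

```python
from functools import lru_cache
from itertools import product

# Concrete tree: (label, (child, child, ...)).
# DTD node (hypothetical encoding): ("elem", label, cardinality, [children])
# or ("choice", [alternatives]) for a disjunction.
COUNTS = {"1": (1, 1), "?": (0, 1), "+": (1, None), "*": (0, None)}

def expand(node, max_rep=2):
    """Return every concrete forest (tuple of trees) a DTD node can generate."""
    if node[0] == "choice":
        # Disjunction: each alternative contributes its own candidates.
        return [f for alt in node[1] for f in expand(alt, max_rep)]
    _, label, card, children = node
    # Conjunction: pick one expansion per child and concatenate them in order.
    child_sets = [expand(c, max_rep) for c in children]
    variants = [(label, sum(combo, ())) for combo in product(*child_sets)]
    lo, hi = COUNTS[card]
    hi = max_rep if hi is None else hi  # assumed cap on "+" and "*"
    forests = []
    for k in range(lo, hi + 1):
        # Fix the repetition count at k; each repetition may be any variant.
        forests.extend(product(variants, repeat=k))
    return forests

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Ordered forest edit distance with unit insert/delete/relabel costs."""
    if not f or not g:
        return size(f) + size(g)
    (a, fa), (b, gb) = f[-1], g[-1]
    return min(
        forest_dist(f[:-1] + fa, g) + 1,  # delete the rightmost root of f
        forest_dist(f, g[:-1] + gb) + 1,  # insert the rightmost root of g
        forest_dist(f[:-1], g[:-1]) + forest_dist(fa, gb) + (a != b),  # match
    )

def similarity(doc, dtd, max_rep=2):
    """Best normalized similarity between doc and any expansion of dtd."""
    return max(
        1 - forest_dist((doc,), f) / (size((doc,)) + size(f))
        for f in expand(dtd, max_rep)
    )

# Illustrative schema: book(title, (author+ | editor), year?)
dtd = ("elem", "book", "1", [
    ("elem", "title", "1", []),
    ("choice", [("elem", "author", "+", []), ("elem", "editor", "1", [])]),
    ("elem", "year", "?", []),
])
doc = ("book", (("title", ()), ("author", ()), ("author", ())))
print(similarity(doc, dtd))  # 1.0: the document matches one expansion exactly
```

Taking the maximum over all expansions reflects the idea that a document conforms to a DTD if it matches any one of the determined trees. The number of expansions grows quickly with the number of choices and with `max_rep`, so this sketch is only practical for small schemas.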