位置:成果数据库 > 期刊 > 期刊详情页
跨本体概念间相似度的计算方法——MD4模型
  • ISSN号:1671-1815
  • 期刊名称:《科学技术与工程》
  • 时间:0
  • 分类:TP391[自动化与计算机技术—计算机应用技术;自动化与计算机技术—计算机科学与技术]
  • 作者机构:[1]国防科学技术大学信息系统工程重点实验室,长沙410073
  • 相关基金:国家自然科学基金项目(60172012)
中文摘要:

随着XML技术的不断应用和推广,XML结构聚类技术在XML管理与挖掘中扮演着重要角色.针对目前XML结构聚类算法聚类不准确、效率低、对数据输入次序敏感的不足,提出簇核心的概念,并指出在动态环境下,对簇核心加以正确维护可以支持增量式聚类.在此基础上设计了一套有效的、XML结构聚类算法COXClustering,该算法涵盖静态聚类和增量式聚类,静态聚类提取子树作为特征合理反映XML结构之间的相似性,并利用簇核心快速分类的特点提高聚类效率,利用簇核心正交的特点降低对数据输入次序的敏感性;增量式聚类根据当前增加的XML文档动态调整簇核心,从而自适应地指导增量式聚类.理论分析和实验表明该算法静态聚类效率高、聚类质量好、能够有效屏蔽输入次序的敏感性,增量式聚类将聚类速度大幅度提升,聚类质量接近静态聚类质量.

英文摘要:

With the increasing applications and developments of XML, XML structural clustering plays an important role both in management and in mining of XML documents. Although many XML structural clustering algorithms are proposed, they are ineffective, inefficient and sensitive to input order in practice. In addition, they can't satisfy incremental clustering under some certain background. This paper addresses these problems by proposing a novel concept--cluster-core, and points out that incremental clustering can be supported if the cluster-cores are mantained correctly in dynamic environment. An effective XML structural clustering algorithm, COXClustering, is presented, which covers static clustering and incremental clustering. In static clustering, COXClustering extracts sub-trees to measure similarity between XML structures, and it utilizes classification to improve clustering efficiency and reduces sensitivity to input order by the orthogonality of cluster-cores. In incremental clustering, it dynamically adjusts cluster-cores based on current added XML documents, and then guides incremental clustering through both instant adjustment and batch adjustment adaptively. Finally, a comprehensive experiment on both synthetic and real dataset is conducted to show that COXClustering is capable of improving clustering efficiency and quality, as well as being insensitive to input order in static clustering. The experiment also shows that incremental clustering highly speeds up clustering and the quality of incremental clustering is close to that of static clustering.

同期刊论文项目
同项目期刊论文
期刊信息
  • 《科学技术与工程》
  • 北大核心期刊(2011版)
  • 主管单位:中国科学技术协会
  • 主办单位:中国技术经济学会
  • 主编:明廷华
  • 地址:北京市学院南路86号
  • 邮编:100081
  • 邮箱:ste@periodicals.net.cn
  • 电话:010-62118920
  • 国际标准刊号:ISSN:1671-1815
  • 国内统一刊号:ISSN:11-4688/T
  • 邮发代号:2-734
  • 获奖情况:
  • 国内外数据库收录:
  • 中国中国科技核心期刊,中国北大核心期刊(2011版),中国北大核心期刊(2014版)
  • 被引量:29478