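A Join Algorithm for Huge Dimension Tables in Constructing StreamCube
  • ISSN: 1000-1239
  • Journal: Journal of Computer Research and Development (《计算机研究与发展》)
  • Date: 0
  • Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] School of Computer Science, National University of Defense Technology, Changsha 410073; [2] School of Software, Changsha Social Work College, Changsha 410004
  • Funding: National High Technology Research and Development Program of China (863 Program) (2007AA010502, 2007AA01Z474, 2006AA01Z451); Program for New Century Excellent Talents in University, Ministry of Education (NCET-06-0928)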
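Chinese abstract:

Table join is one of the most important operations in relational databases and is equally important in data stream management systems. When building the aggregate queries of a StreamCube, joining the data stream with a huge dimension table (such as an IP address dimension table) consumes a large share of the limited computing resources and memory. A huge dimension table has to be partitioned into multiple blocks that are read into memory one block at a time, which causes frequent disk I/O. Based on the characteristics of dimension tables and their join-key layer, the paper reduces the join-key redundancy in the join between dimension tables and the data stream and losslessly compresses a dimension table into a join-key range dimension table (RJ-DT) that fits in memory, which raises the problem of non-equijoins over data streams. It then proposes a multi-table join algorithm for huge dimension tables, the multi dynamic index nested-loop join algorithm, which performs an efficient non-equijoin between the data stream and the compressed dimension tables and extends to multi-table joins. Theoretical analysis and experimental results show that the algorithm markedly improves join performance for huge dimension tables, with speedups of up to an order of magnitude, and is highly practical.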
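
The compression step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes integer join keys (for example, IP addresses as integers) and a single concept column, and the names `compress_to_ranges` and `expand` are hypothetical. Runs of consecutive keys that map to the same concept value are merged into one (start, end, value) row, and expanding the rows again reproduces the original table, which is what makes the compression lossless.

```python
from typing import Iterable, List, Tuple

# One row of a range join-key table: (start_key, end_key, concept_value).
Range = Tuple[int, int, str]


def compress_to_ranges(rows: Iterable[Tuple[int, str]]) -> List[Range]:
    """Merge runs of consecutive join keys that share one concept value.

    Assumption: keys are integers and each key appears at most once.
    """
    ranges: List[Range] = []
    for key, value in sorted(rows):
        last = ranges[-1] if ranges else None
        if last is not None and last[2] == value and last[1] + 1 == key:
            # Extend the current run instead of storing another row.
            ranges[-1] = (last[0], key, value)
        else:
            ranges.append((key, key, value))
    return ranges


def expand(ranges: Iterable[Range]) -> List[Tuple[int, str]]:
    """Rebuild the original (key, value) rows to check losslessness."""
    return [(k, v) for start, end, v in ranges for k in range(start, end + 1)]


# Hypothetical dimension table: join key -> country code.
rows = [(1, "CN"), (2, "CN"), (3, "CN"), (4, "US"), (5, "US"), (9, "CN")]
country_ranges = compress_to_ranges(rows)
print(country_ranges)  # [(1, 3, 'CN'), (4, 5, 'US'), (9, 9, 'CN')]
```

Six rows shrink to three here; the saving grows with the length of the runs, which is why a table keyed by IP address is a natural candidate.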

English abstract:

Join is one of the most important operations in relational databases, and it is equally important in data stream management systems. In the group-by queries that construct a StreamCube, the join is performed before the aggregation, and joining the data stream with huge dimension tables (such as an IP address table) consumes a large share of the limited CPU power and memory capacity. Generally, a huge dimension table must be partitioned into smaller tables, and each partition is loaded into memory in turn, which causes frequent disk I/O. To avoid this shortcoming, the paper compresses huge dimension tables losslessly by exploiting the characteristics of the dimension tables and their join-key layer and by finding join-key redundancy in those tables. A dimension table with n concept columns is compressed into n ranged join-key dimension tables (RJ-DTs) by reducing join-key redundancy and applying the decomposition storage model of column stores. Each RJ-DT is composed of start and end columns and several concept columns. This raises a new issue: a non-equijoin, called a range join, between the data stream and an RJ-DT. The paper then proposes a multi-join algorithm for huge dimension tables, called multi dynamic index nested-loop join (MDI-NL), which performs the non-equijoin efficiently and also supports multi-way joins. MDI-NL builds an RB+Tree index on each RJ-DT before the join. During the join, it dispatches indexes dynamically according to the demand of each group-by, so that the smallest suitable index is used, which makes MDI-NL more powerful. Theoretical analysis and extensive experiments show that MDI-NL outperforms other join algorithms by up to an order of magnitude for huge dimension table joins and is highly practical.
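
The range join and the per-group-by index choice can be sketched as follows. This is an illustration under stated assumptions, not MDI-NL itself: a sorted list searched with `bisect` stands in for the RB+Tree index, ranges are assumed not to overlap, and `RangeIndex`, `indexes`, and `aggregate` are hypothetical names. Each concept level has its own index, and a group-by probes only the index for the level it needs.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, List, Optional, Tuple

Range = Tuple[int, int, str]  # (start_key, end_key, concept_value)


class RangeIndex:
    """Sorted-array stand-in for a tree index over non-overlapping ranges."""

    def __init__(self, ranges: Iterable[Range]) -> None:
        self.ranges: List[Range] = sorted(ranges)
        self.starts: List[int] = [r[0] for r in self.ranges]

    def lookup(self, key: int) -> Optional[str]:
        # Find the last range whose start is <= key, then check its end.
        i = bisect_right(self.starts, key) - 1
        if i >= 0:
            _, end, value = self.ranges[i]
            if key <= end:
                return value
        return None


# Hypothetical range tables, one per concept level of the same dimension.
indexes = {
    "country": RangeIndex([(1, 5, "CN"), (6, 9, "US")]),
    "city": RangeIndex([(1, 3, "Changsha"), (4, 5, "Beijing"), (6, 9, "Boston")]),
}


def aggregate(stream: Iterable[int], level: str) -> Counter:
    """Count stream tuples per concept value, probing only the needed index."""
    index = indexes[level]
    counts: Counter = Counter()
    for key in stream:
        value = index.lookup(key)
        if value is not None:  # keys outside every range do not join
            counts[value] += 1
    return counts


stream = [1, 2, 4, 7, 7, 12]
print(aggregate(stream, "country"))  # Counter({'CN': 3, 'US': 2})
```

A coarser level such as `country` has fewer ranges than `city`, so a group-by at that level probes a smaller index; the sketch uses this as a simple reading of "smallest suitable index".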

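Journal information
  • Journal of Computer Research and Development (《计算机研究与发展》)
  • Chinese Science and Technology Core Journal
  • Supervising body: Chinese Academy of Sciences
  • Sponsor: Institute of Computing Technology, Chinese Academy of Sciences
  • Editor-in-chief: Xu Zhiwei (徐志伟)
  • Address: Institute of Computing Technology, Chinese Academy of Sciences, No. 6 Kexueyuan South Road, Beijing
  • Postal code: 100190
  • Email: crad@ict.ac.cn
  • Phone: 010-62620696, 62600350
  • International standard serial number: ISSN 1000-1239
  • Domestic serial number: CN 11-1777/TP
  • Postal distribution code: 2-654
  • Awards:
  • Top 100 Outstanding Academic Journals of China, 2001-2007; 2008 中国精品科...; "Double Effect" journal of the China Periodical Matrix
  • Indexed in:
  • Abstracts Journal (Russia); Abstract and Citation Database (Netherlands); Engineering Index (USA); Japan Science and Technology Agency database (Japan); Chinese Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)
  • Citations: 40349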