Research on Data Distribution for OLTP Applications in Cloud Computing Environments
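  • ISSN: 0010-4620
  • Journal: Computer Journal
  • Date: 0
  • Pages: -
  • Classification: TP392 [Automation and Computer Technology - Computer Application Technology; Automation and Computer Technology - Computer Science and Technology]
  • Author affiliations: [1] Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education, Renmin University of China, Beijing 100872; [2] Information Center of the Supreme People's Court, Beijing 100745; [3] School of Information, Renmin University of China, Beijing 100872; [4] State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191
  • Funding: Open Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2012KF-09); National Natural Science Foundation of China (61003086, 61170010)
  • Related project: Research on Core Technologies of Massive RDF Data Management Systems in Cloud Computing Environments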
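Abstract (translated from Chinese):

Cloud computing brings new opportunities for the efficient storage and management of distributed data in large-scale OLTP applications, while big data poses new challenges for that storage and management; automatic data distribution has gradually become both a research focus and a difficult problem in distributed systems. This paper analyzes the three elements that affect the data distribution problem (data, workload, and node), abstracts the problem into three interrelated sub-problems (data fragmentation, data allocation, and workload execution), and proposes DaWN, a triangular architecture for the data distribution problem. Because different systems have different application requirements, the DaWN architecture uses a cost model as its hub to balance the performance goals a specific application must reach against its resource constraints; the paper also sets out the technical challenges facing the data distribution problem. The paper analyzes in detail the three basic elements represented by the vertices of the DaWN architecture, focuses on explaining the three relationships represented by its edges, and on that basis summarizes the research results and progress on the three sub-problems of data fragmentation, data allocation, and workload execution for large-scale OLTP applications in cloud environments. Based on this analysis, the paper takes data fragmentation, data allocation, and workload execution as variables, uses a truth table to cover the 8 types of the data distribution problem, and presents the distribution of related work in a three-dimensional coordinate system. Finally, the paper looks ahead to future directions for the data distribution problem from four perspectives: cost model research, benchmark research, automated data distribution techniques, and application-specific research.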

English abstract:

Cloud computing raises new opportunities and challenges for efficient distributed data storage and management in large-scale OLTP applications. In data management, data distribution is one of the best-known techniques for platform scalability. With the dramatic increase in data volume, automatic data distribution has become both a key technique and an intractable problem for distributed systems. Focusing on data distribution in cloud environments, this work first studies the three essential elements of the field: data, workload, and node. Based on this analysis, it summarizes the relationships among them as data fragmentation, data allocation, and workload processing, and abstracts the data distribution problem as a triangle model called DaWN, which takes the three essential elements as the triangle's vertices and encodes their relationships as its edges. Because different systems may have different requirements, DaWN uses a cost model as its core to reach the performance goals of a specific application under given resource limitations, and the work presents the main challenges in data distribution. It analyzes the characteristics of the three key elements in DaWN, elaborates on each of their relationships (data fragmentation, data allocation, and workload processing), and provides a taxonomy and summary of the latest research on large-scale OLTP applications in cloud environments. Based on these analyses, the work uses data fragmentation, data allocation, and workload processing as parameters, provides a truth table that covers all 8 possible cases in data distribution, and presents these cases, together with related work, in a cube-like three-dimensional coordinate system. Finally, it discusses future work on data distribution in cloud environments, including studies on cost models, benchmarks, automatic techniques, and specific applications.
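As an illustration of the truth-table idea in the abstract, the following minimal Python sketch enumerates the 8 combinations of three boolean variables and maps each one to a corner of a unit cube. The variable names, the reading of `True` as "a given work addresses this sub-problem", and the helper functions are assumptions made for this example; the abstract does not specify how the paper encodes its table or its coordinate system.

```python
from itertools import product

# The three sub-problems named in the abstract, treated here as boolean variables.
# Assumption: True means a given piece of related work addresses that sub-problem.
DIMENSIONS = ("data_fragmentation", "data_allocation", "workload_processing")


def truth_table():
    """Return all 2**3 = 8 assignments as dicts, in binary counting order."""
    return [
        dict(zip(DIMENSIONS, bits))
        for bits in product((False, True), repeat=len(DIMENSIONS))
    ]


def as_coordinate(row):
    """Map one truth-table row to a corner (x, y, z) of the unit cube."""
    return tuple(int(row[name]) for name in DIMENSIONS)


if __name__ == "__main__":
    for row in truth_table():
        print(as_coordinate(row), row)
```

Each row of the table corresponds to exactly one cube corner, which is one way to read the abstract's description of presenting the 8 cases in a three-dimensional coordinate system.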
