
A Survey of Distributed Stream Processing Technology

ISSN: 1000-1239
Journal: 《计算机研究与发展》 (Journal of Computer Research and Development)
Date: 0
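Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Computer Science and Technology, Shandong University, Jinan 250101
Funding: National Natural Science Foundation of China (61272092); Natural Science Foundation of Shandong Province (ZR2012FZ004); Science and Technology Development Program of Shandong Province (2014GGE27178); National Basic Research Program of China (973 Program) (2015CB352500); Independent Innovation Foundation of Shandong University (2012ZD012); Taishan Scholars Program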

Keywords: big data, data stream, distributed stream processing, real-time processing, distributed system

Chinese abstract (translated):

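With the rapid development of computer and network technologies and the growing variety of data acquisition methods, more and more fields need to process massive, high-speed data in real time. Because such needs often exceed the capabilities of traditional data processing technologies, the distributed stream processing paradigm has emerged. This survey first reviews the background from which distributed stream processing arose and how the technology has evolved, then compares it with other related big data processing technologies to delimit the scope of distributed stream data processing. It then analyzes in depth the main issues that distributed stream processing must address, including data models, system models, storage management, semantic guarantees, load control, and fault tolerance, and points out the strengths and weaknesses of existing solutions. Next, it introduces several representative distributed stream processing systems, such as S4, Storm, and Spark Streaming, and compares them systematically. Finally, it presents several typical applications of distributed stream processing in areas such as social media processing and discusses directions for further research in the field.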

English abstract:

The rapid growth of computing and networking technologies, along with increasingly rich means of data acquisition, has given rise to a large array of applications that require real-time processing of massive, high-velocity data. Because processing such data often exceeds the capacity of existing technologies, a class of approaches following the distributed stream processing paradigm has emerged. In this survey, we first review the application background of distributed stream processing and discuss how the technology has evolved into its current form. We then contrast it with other big data processing technologies to help readers better understand the characteristics of distributed stream processing. We provide an in-depth discussion of the main issues involved in distributed stream processing, such as data models, system models, storage management, semantic guarantees, load control, and fault tolerance, pointing out the pros and cons of existing solutions. This is followed by a systematic comparison of several popular distributed stream processing platforms, including S4, Storm, and Spark Streaming. Finally, we present a few typical applications of distributed stream processing and discuss possible directions for future research in this area.
