An Online Visualization System for Streaming Log Data of Computing Clusters

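ISSN: 1007-0214
Journal: Tsinghua Science and Technology
Date: 2013.8.8
Pages: 196-205
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]; TP301.6 [Automation and Computer Technology: Computer System Architecture; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou 310058, Zhejiang, Peoples R China; [2] Hangzhou Dianzi Univ, Coll Comp Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
Funding: Supported by the National Natural Science Foundation of China (Nos. 61232012 and 61202279); the National High-Tech Research and Development (863) Program of China (No. 2012AA120903); the Doctoral Fund of Ministry of Education of China (No. 20120101110134)
Related project: Research on key technologies for the visualization of large-scale, complex dynamic graphs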

Authors: Jing Xia, Feiran Wu, Cong Xie, Zhen Liu, Wei Chen

Keywords: visualization system, computing cluster, log data, data stream, data processor, visualization techniques, online, time variation, performance metrics monitoring, streaming data, visualization

Abstract:

Monitoring a computing cluster requires collecting and understanding log data generated at the core, computer, and cluster levels at run time. Visualizing the log data of a computing cluster is a challenging problem due to the complexity of the underlying dataset: it is streaming, hierarchical, heterogeneous, and multi-sourced. This paper presents an integrated visualization system that employs a two-stage streaming process mode. Prior to the visual display of the multi-sourced information, the data generated from the clusters is gathered, cleaned, and modeled within a data processor. The visualization supported by a visual computing processor consists of a set of multivariate and time variant visualization techniques, including time sequence chart, treemap, and parallel coordinates. Novel techniques to illustrate the time tendency and abnormal status are also introduced. We demonstrate the effectiveness and scalability of the proposed system framework on a commodity cloud-computing platform.
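
The two-stage process described in the abstract (gather, clean, and model the data first, then hand it to the visualization) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout `(timestamp, cluster, computer, core, load)`, the function names `clean`, `model`, and `flag_abnormal`, and the fixed abnormality threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical record layout (an assumption, not taken from the paper):
# (timestamp, cluster, computer, core, load), with load expected in [0, 1].


def clean(records):
    """Stage 1a: drop malformed records and clamp loads into [0, 1]."""
    for rec in records:
        if len(rec) != 5:
            continue  # wrong number of fields
        ts, cluster, computer, core, load = rec
        if not isinstance(load, (int, float)):
            continue  # missing or non-numeric metric
        yield ts, cluster, computer, core, min(max(float(load), 0.0), 1.0)


def model(records):
    """Stage 1b: roll core-level loads up to computer and cluster level."""
    per_computer = defaultdict(list)
    for _, cluster, computer, _, load in records:
        per_computer[(cluster, computer)].append(load)
    computer_load = {key: mean(vals) for key, vals in per_computer.items()}

    per_cluster = defaultdict(list)
    for (cluster, _), load in computer_load.items():
        per_cluster[cluster].append(load)
    cluster_load = {key: mean(vals) for key, vals in per_cluster.items()}
    return computer_load, cluster_load


def flag_abnormal(levels, threshold=0.9):
    """Return the keys whose aggregated load exceeds an assumed threshold."""
    return sorted(key for key, load in levels.items() if load > threshold)


# One batch of a stream; stage 2 (a treemap, parallel coordinates, and so on)
# would consume the dictionaries returned by model().
batch = [
    (0, "c1", "n1", 0, 0.5),
    (0, "c1", "n1", 1, 1.5),   # out of range, clamped to 1.0
    (0, "c1", "n2", 0, 0.25),
    (0, "c1", "n2", 1, None),  # missing metric, dropped
    ("bad",),                  # malformed, dropped
]
computer_load, cluster_load = model(clean(batch))
hot_computers = flag_abnormal(computer_load, threshold=0.7)
```

In a streaming setting, each incoming batch would pass through `clean` and `model` before the visual stage redraws; the per-computer and per-cluster dictionaries mirror the core, computer, and cluster hierarchy mentioned in the abstract.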
