A Performance Optimization Method for MPI Broadcast under Unbalanced Process Arrival Patterns

ISSN: 1000-9825
Journal: Journal of Software (《软件学报》)
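Classification: TP316 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] College of Computer, National University of Defense Technology, Changsha, Hunan 410073
Funding: National Natural Science Foundation Fund for Innovative Research Groups (60621003)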

Keywords: process arrival pattern, MPI, collective communication, MPI_Bcast, competitive and pipelined method

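Abstract (translated from Chinese):

To improve the performance of MPI broadcast under unbalanced process arrival (UPA) patterns, this paper analyzes the broadcast problem under UPA theoretically and proves that, on a multicore cluster, competition among the multiple MPI processes within a node can effectively reduce the impact of UPA on MPI broadcast performance. On this basis, a new optimization method is proposed: the competitive and pipelined (CP) method. Through an intra-node process competition mechanism, the CP method starts inter-node communication as early as possible during the broadcast. A broadcast algorithm optimized with this method uses shared memory for intra-node communication and uses the leader processes produced by the competition mechanism to run the original algorithm for inter-node communication. The method also overlaps inter-node and intra-node communication in a pipelined fashion, which exploits the multicore advantage of each node in the cluster, reduces the impact of UPA on MPI broadcast, and improves performance. To verify the effectiveness of the CP method, three typical MPI broadcast algorithms, each suited to a different message length, were optimized with it. On a real system, CP broadcast was evaluated with micro-benchmarks and two real applications; the results show that the method effectively improves the performance of traditional broadcast algorithms under UPA. In the application workload experiments, CP broadcast performs about 16% better than pipelined broadcast and 18% to 24% better than the broadcast in MVAPICH2 1.2.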

English abstract:

This paper aims to improve the performance of MPI broadcast under unbalanced process arrival (UPA) patterns. It analyzes the problem with a performance model and proves that, on a multicore cluster, competition among the MPI processes within a node can effectively reduce the negative impact of UPA on MPI broadcast. Based on this result, a new optimization method, the competitive and pipelined (CP) method, is proposed. The CP method uses an intra-node competitive mechanism to start inter-node communication early in the broadcast. In a CP-based broadcast algorithm, intra-node communication is implemented through shared memory, inter-node communication is executed by a set of leader MPI processes selected by the competitive mechanism, and the two are overlapped in a pipelined fashion. To verify the CP method, three typical broadcast algorithms are improved with it and evaluated on a real platform using a micro-benchmark and two practical applications. The results show that the CP method effectively improves the performance of broadcast algorithms under UPA patterns. In the application experiments, CP broadcast performs about 16% better than pipelined broadcast and 18% to 24% better than the broadcast operation in MVAPICH2 1.2.
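
The two ideas in the abstract, choosing the node leader by competition and overlapping the two communication stages, can be illustrated with a toy timing model. The sketch below is not the paper's performance model or implementation. The function names, the flat root-to-leader transfer, the constant costs `t_inter` and `t_intra`, and the assumption that the root is process 0 on node 0 are all hypothetical simplifications chosen for illustration.

```python
from typing import List


def total_wait(arrivals: List[List[float]], t_inter: float,
               t_intra: float, competitive: bool) -> float:
    """Sum, over all non-root processes, of the time spent inside the broadcast.

    arrivals[n][i] is the time at which local process i on node n enters
    the broadcast. Assumptions (hypothetical): the root is process 0 on
    node 0, each node leader receives directly from the root in t_inter,
    and every non-root process then copies the data in t_intra.
    """
    root_arrival = arrivals[0][0]
    total = 0.0
    for n, node in enumerate(arrivals):
        if n == 0:
            # On the root's node the data is available once the root arrives.
            ready = root_arrival
        else:
            # Fixed leader: always local process 0.
            # Competitive leader: whichever local process arrives first.
            leader = min(node) if competitive else node[0]
            # The transfer cannot start before both ends have arrived.
            ready = max(root_arrival, leader) + t_inter
        for i, arrive in enumerate(node):
            if n == 0 and i == 0:
                continue  # the root already holds the data
            done = max(arrive, ready) + t_intra
            total += done - arrive
    return total


def two_stage_time(segments: int, t_inter_seg: float,
                   t_intra_seg: float, pipelined: bool) -> float:
    """Time to move `segments` message segments through two stages.

    Without overlap each segment pays both stage costs in sequence.
    With overlap, after the first segment fills the pipeline, one segment
    completes every max(t_inter_seg, t_intra_seg).
    """
    if not pipelined:
        return segments * (t_inter_seg + t_intra_seg)
    return (t_inter_seg + t_intra_seg
            + (segments - 1) * max(t_inter_seg, t_intra_seg))


if __name__ == "__main__":
    # Three nodes with two processes each; on node 1 the fixed leader is late.
    arrivals = [[0.0, 1.0], [5.0, 0.0], [3.0, 4.0]]
    print("fixed leader:      ", total_wait(arrivals, 2.0, 0.5, False))
    print("competitive leader:", total_wait(arrivals, 2.0, 0.5, True))
    print("no overlap:        ", two_stage_time(4, 2.0, 1.0, False))
    print("pipelined:         ", two_stage_time(4, 2.0, 1.0, True))
```

In this toy model the competitive choice helps only on nodes where the designated leader is not the first process to arrive, which is the situation an unbalanced arrival pattern creates.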

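Project associated with this journal paper

Key Technologies for Petascale High-Performance Computing

Journal papers: 36

Journal papers from the same project

Long Stream Segmentation Based on a Parametric Model on Stream Processors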

SRF Coloring: Stream Register File Allocation via Graph Coloring

Managing Data-Objects in Dynamically Reconfigurable Caches

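A Scalable Unbiased Sampling Method Based on Adaptive Random Walks

An Optimized Data Consistency Maintenance Method Based on Key Attributes

Design and Implementation of a Software Cache on the Cell Processor

A Transactional Memory Conflict Avoidance Method for CC-NUMA Architectures

An Abstract Domain of Single-Variable Interval Linear Inequalities

Intelligent Multi-Hop Promotion for Non-Uniform Caches

Optimized Stream Organization on the Imagine Stream Processor

Research on Component Parallelization Based on Fuzzy Clustering Analysis

A Survey of Clustering in P2P Overlay Networks

An Optimized Data Consistency Maintenance Method Based on Data Correlation

Weighted Shared-Cache Partitioning for Multithreaded Multiprogrammed Workloads

An Efficient Conflict-Free Parallel-Access Memory Model for Image Processing Applications with Multiple Regions of Interest

Improving the Performance of Highly Reliable Space Computers through Software Fault Tolerance Based on COTS Components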

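Axiomatization of the μ-Calculus Based on Game Theory

DOOC: A Software/Hardware Cooperatively Managed Cache That Effectively Eliminates Thrashing

Design and Implementation of Parallel EP and GEMM Algorithms on the FT64 Parallel System

Design and FPGA Implementation of a Pipelined LU Decomposition Algorithm with Optional Pivoting

Research on Network Distance Prediction

Efficient Partially Redundant Fault-Tolerant Compilation: Replicating the Critical Subgraph of Error Flow

A Survey of Hardware Fault Tolerance Techniques for Large-Scale Parallel Computer Systems

A Survey of Metrics for Parallel Computing Systems

Error Detection for MPI Programs Using Redundant Processes

Performance-Optimal Shared-Cache Partitioning for Dual-Core Processors

Design and Implementation of a Fault-Tolerant Parallel Algorithm for Matrix LU Decomposition

Applications of Laser-Plasma X-Ray Sources

Research on MPI Communication in Heterogeneous Environments

Implementation Techniques and Performance Testing of an MPI Checkpointing System Based on the Lustre File System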

StreamJacobi: Efficient implementation of 2-D Jacobi on a stream processor

Symbolic Model Checking of ETL

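Journal information

Journal of Software (《软件学报》)
Peking University Core Journal (2011 edition)

Supervising body: Chinese Academy of Sciences
Sponsors: Institute of Software, Chinese Academy of Sciences; China Computer Federation
Editor-in-chief: Zhao Chen
Address: Institute of Software, Chinese Academy of Sciences, P.O. Box 8718, Beijing
Postal code: 100190
Email: jos@iscas.ac.cn
Telephone: 010-62562563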

International standard serial number: ISSN 1000-9825
Domestic serial number: CN 11-2560/TP
Postal distribution code: 82-367

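Awards:
Selected as a "Double Hundred Journal" of the China Periodical Matrix in 2001; awarded First Prize for Outstanding Science and Technology Journals of the Chinese Academy of Sciences in 2000

Indexed in domestic and international databases:
Referativnyi Zhurnal (Russia), Mathematical Reviews online (USA), Index Copernicus (Poland), Zentralblatt MATH (Germany), Scopus (Netherlands), Engineering Index (USA), Cambridge Scientific Abstracts (USA), Science Abstracts / INSPEC (UK), Japan Science and Technology Agency database (Japan), China Science and Technology Core Journals (China), Peking University Core Journals, 2000, 2004, 2008, 2011, and 2014 editions (China)

Times cited: 54609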