Although system noise has only a limited effect on a single process, its effect on the performance of large-scale parallel programs cannot be ignored. This paper proposes FWQ-MPI, a method for quantitatively evaluating the impact of noise based on the computation-communication characteristics of parallel programs, and defines four quantitative indicators: the proportion of the amount of noise, the proportion of noise impact, the actual computation-to-communication time ratio, and the ideal computation-to-communication time ratio. Three iterative methods for solving sparse systems of linear algebraic equations are selected as the objects of study, and their computation and synchronous-communication characteristics are extracted to form micro-benchmarks, which are run on an MPP machine with 512 dual six-core nodes. The measurements clarify the mechanism by which system noise affects parallel program performance and reveal several regularities: 1) during the execution of a bulk synchronous parallel (BSP) program, the proportion of the amount of system noise is small, about 2% to 6% of the entire computing time; 2) system noise nevertheless has a large impact on the performance of BSP parallel programs (at parallel scales of 1024, 2048, and 4096, the proportion of noise impact is about 30% to 70%); 3) the impact grows as the scale of the parallel program increases and shrinks as the amount of computation between two successive synchronous communications increases; 4) the impact of system noise shows up mainly in the "actual computation-to-communication time ratio" of a BSP parallel program being far smaller than the "ideal computation-to-communication time ratio".
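The regularities listed above (a small noise share, a much larger performance impact, growth with parallel scale, decline with more computation per synchronization interval) can be illustrated with a toy simulation. The sketch below is not FWQ-MPI and does not reproduce the paper's measurements. It assumes a hypothetical noise model in which each process, in each interval, is delayed by a fixed detour with a fixed probability, and it assumes simple definitions for the two indicators it reports. The function name, its parameters, and both definitions are illustrative assumptions.

```python
import random


def simulate_bsp(num_procs, work, steps, hit_prob=0.02, detour=1.0, seed=0):
    """Toy BSP model (assumed, not the paper's method).

    Every superstep costs `work` time units of computation per process.
    With probability `hit_prob` a process also suffers a noise detour of
    length `detour`. The synchronization at the end of the superstep makes
    all processes wait for the slowest one.
    """
    rng = random.Random(seed)
    ideal = work * steps      # run time if there were no noise at all
    actual = 0.0              # run time with noise and synchronization
    noise_total = 0.0         # noise summed over all processes and steps
    for _ in range(steps):
        slowest = 0.0
        for _ in range(num_procs):
            delay = detour if rng.random() < hit_prob else 0.0
            noise_total += delay
            slowest = max(slowest, delay)
        actual += work + slowest
    # Assumed indicator definitions, both relative to the noise-free time:
    noise_amount = noise_total / (num_procs * ideal)  # average noise share per process
    noise_impact = (actual - ideal) / ideal           # slowdown of the whole program
    return noise_amount, noise_impact


if __name__ == "__main__":
    for procs in (16, 128, 1024):
        amount, impact = simulate_bsp(procs, work=10.0, steps=200)
        print(f"procs={procs:5d}  noise amount={amount:.4f}  noise impact={impact:.4f}")
```

In this model the average noise share per process stays roughly constant as the process count grows, while the slowdown rises because the chance that at least one process is delayed in a given interval approaches one. Raising `work` while keeping the other parameters fixed lowers the slowdown, which mirrors the third regularity.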