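Efficient parallel numerical simulation of very-large-scale computational fluid dynamics (CFD) applications on today's mainstream many-core heterogeneous high-performance computers still faces a series of challenging technical problems, and it remains one of the active research topics in the field. Targeting the Tianhe-2 heterogeneous high-performance parallel computing platform, this paper explores efficient parallelization of a high-order-accurate CFD flow-field simulation code. It focuses on performance optimization strategies that match the characteristics of CFD applications to the features of many-core heterogeneous platforms, and proposes a series of performance-improving techniques covering task decomposition, exploitation of parallelism, multi-thread optimization, SIMD vectorization, and cooperative optimization of CPUs and accelerators. Several test cases were simulated on Tianhe-2; the largest CFD problem reached 122.8 billion grid points and used about 590,000 CPU+MIC processor cores in total. The test results show that the ported and optimized code runs about 2.6 times faster and scales well.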
Simulating large-scale computational fluid dynamics (CFD) applications on contemporary supercomputers with many-core heterogeneous architectures, such as Tianhe-2, still poses great challenges and is one of the research hotspots in this field. In this paper, we explore techniques for efficient parallel simulation of large-scale CFD applications with high-order accurate schemes on heterogeneous high-performance computing (HPC) platforms. We propose performance optimization approaches and strategies that match both the characteristics of CFD applications and the architecture of heterogeneous HPC platforms, covering task decomposition, exploitation of parallelism, multi-threaded optimization, vectorization with single-instruction multiple-data (SIMD) instructions, and cooperative optimization of CPUs and co-processors. To evaluate these techniques, numerical experiments were performed on the Tianhe-2 supercomputer, with the maximum number of grid points reaching 1.228 × 10^11 and a total of about 590,000 CPU and MIC co-processor cores in use. To the best of our knowledge, such a large-scale CFD simulation with a high-order accurate scheme has never been attempted before. The results show that the optimized code achieves a speedup of about 2.6× on the hybrid CPU and co-processor platform over the CPU-only platform, and that it scales well. The present work redefines the frontier of high-performance computing for fluid dynamics simulations on heterogeneous platforms.