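GPUs were originally designed for graphics rendering, but in recent years they have evolved into highly parallel, multithreaded, general-purpose many-core processors with powerful computing capability and very high memory bandwidth. The peak computing performance of current mainstream GPUs is typically tens of times that of CPUs, which offers a new possibility for solving computationally intensive problems. Molecular dynamics simulation demands very high computing power, so running it on GPUs is a natural choice. Using an NVIDIA GeForce GTX 295 GPU and the CUDA 2.3 development environment, this paper implements van der Waals force calculation, van der Waals potential energy calculation, and grid-based neighbor searching. For the neighbor-search algorithm, different implementation strategies are given for GPUs of different compute capability. Tests on a polyethylene polymer system of 360,000 particles show that the results for one time step agree with the corresponding results from GROMACS, a molecular dynamics package noted for its performance (run on an Intel Xeon E5405 workstation), and that performance improves substantially relative to a single CPU core: neighbor searching is accelerated 17-fold and van der Waals force calculation 47-fold. The boundary problem that arises in neighbor searching is also solved. Although this paper targets van der Waals force calculation, the strategies are general and may serve as a reference for researchers in other fields. The test results indicate that using GPUs to accelerate large-scale computations is worthwhile.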
GPUs (graphics processing units), originally designed for graphics rendering, have lately evolved into highly parallel, multithreaded, many-core processors with tremendous computational horsepower and very high memory bandwidth. Mainstream GPUs far exceed CPUs in raw computing power, which provides a new way to solve data-intensive problems. Molecular dynamics simulation is extremely computationally demanding, which makes it a natural candidate for implementation on GPUs. We have implemented van der Waals force calculation and neighbor searching on a GeForce GTX 295 with CUDA toolkit 2.3 under the Ubuntu Linux 9.04 desktop operating system. We demonstrate two ways to implement neighbor searching and compare their performance. The results of a one-time-step calculation on the GPU for 360,000 particles, comprising 1,200 identical PE (polyethylene) chains, agree with the output of a single-threaded calculation by GROMACS 4.05 on one core of an Intel Xeon(R) 5405 @ 2.0 GHz CPU in a Dell workstation, and speedups of 17 in neighbor searching and 47 in van der Waals force computation were obtained. The strategies for implementing non-bonded force computation on the GPU described in this paper suggest that GPU-accelerated large-scale molecular dynamics calculations can be expected.
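
The abstract names grid-based neighbor searching and van der Waals force calculation but gives no implementation details. The following is therefore only a minimal CPU-side sketch in Python of the general technique, not the paper's CUDA code. It assumes a cubic periodic box, a 12-6 Lennard-Jones form for the van der Waals interaction, and reduced units; the function names (`build_cells`, `neighbor_pairs`, `lj_forces`) and the parameters `epsilon` and `sigma` are hypothetical and chosen for illustration.

```python
import itertools


def minimum_image(d, box):
    """Wrap one displacement component into [-box/2, box/2] (periodic box)."""
    return d - box * round(d / box)


def build_cells(positions, box, cutoff):
    """Bin particles into a uniform grid whose cell edge is >= cutoff."""
    n_cells = max(1, int(box // cutoff))       # cells per dimension
    cell_len = box / n_cells
    cells = {}
    for idx, pos in enumerate(positions):
        key = tuple(int((c % box) / cell_len) % n_cells for c in pos)
        cells.setdefault(key, []).append(idx)
    return cells, n_cells


def neighbor_pairs(positions, box, cutoff):
    """Return sorted pairs (i, j), i < j, closer than cutoff.

    Only the particle's own cell and its adjacent cells are scanned;
    cell indices wrap around so pairs across the box boundary are found.
    """
    cells, n_cells = build_cells(positions, box, cutoff)
    pairs = []
    for key, members in cells.items():
        # A set removes duplicate cells when the grid has < 3 cells per side.
        near = {tuple((k + o) % n_cells for k, o in zip(key, off))
                for off in itertools.product((-1, 0, 1), repeat=3)}
        for other in near:
            for i in members:
                for j in cells.get(other, ()):
                    if i < j:
                        d = [minimum_image(a - b, box)
                             for a, b in zip(positions[i], positions[j])]
                        if sum(x * x for x in d) < cutoff * cutoff:
                            pairs.append((i, j))
    return sorted(pairs)


def lj_forces(positions, box, cutoff, epsilon=1.0, sigma=1.0):
    """Assumed 12-6 Lennard-Jones forces and total potential energy."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    energy = 0.0
    for i, j in neighbor_pairs(positions, box, cutoff):
        d = [minimum_image(a - b, box)
             for a, b in zip(positions[i], positions[j])]   # r_i - r_j
        r2 = sum(x * x for x in d)
        sr6 = (sigma * sigma / r2) ** 3
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
        scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
        for k in range(3):
            forces[i][k] += scale * d[k]    # force on i
            forces[j][k] -= scale * d[k]    # equal and opposite force on j
    return forces, energy
```

Calling `lj_forces(positions, box, cutoff)` with a list of `(x, y, z)` tuples returns per-particle forces and the total pair energy. In a GPU version one would typically assign one thread per particle or per cell, but how the paper maps this work onto threads is not stated in the abstract.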