Using CUDA to program the GPU, the neighbor search step of particle simulation was optimized by adopting and implementing a neighbor search scheme that requires no sorting. A dual-GPU simulation scheme was also designed and implemented, relying on careful task partitioning and little data exchange. The results show that the non-sorting neighbor search reduces total simulation time by nearly 50% when the particle count is below 100,000 and by 12% when it exceeds 500,000. The dual-GPU scheme reduces computation time by 16% when the particle count exceeds 500,000, and its advantage grows as the number of particles increases.