This paper defines the GPU (graphics processing unit), general-purpose computing on GPUs (general-purpose GPU, GPGPU), and GPU-based programming models and environments. It divides the development of the GPU into four stages and describes how the GPU architecture evolved from a non-unified rendering architecture to a unified rendering architecture, and then to the new-generation Fermi architecture. It then compares the GPGPU architecture with the multi-core CPU architecture and the distributed cluster architecture in terms of both hardware and software. The analysis shows that medium-grained, thread-level, data-intensive parallel computation is best served by multi-core, multi-threaded parallelism; coarse-grained, network-intensive parallel computation by cluster parallelism; and fine-grained, compute-intensive parallel computation by GPGPU parallelism. Finally, the paper presents the research hotspots and future directions of GPGPU, namely automatic parallelization for GPGPU, CUDA support for multiple languages, and CUDA performance optimization, and introduces some typical applications of GPGPU.