Graphics processing units (GPUs) have taken on an important role in the general-purpose computing market in recent years. At present, the common approach to programming GPUs is to write GPU-specific code with low-level GPU APIs such as CUDA. Although this approach can achieve good performance, it creates serious portability issues, because programmers are required to write a specific version of the code for each potential target architecture. This results in high development and maintenance costs. We believe it is desirable to have a programming model that provides source code portability between CPUs and GPUs, as well as between different GPUs. This would allow programmers to write one version of the code, which can be compiled and executed efficiently on either CPUs or GPUs without modification. In this paper, we propose MapCG, a MapReduce framework that provides source-code-level portability between CPUs and GPUs. In contrast to other approaches such as OpenCL, our framework, based on MapReduce, provides a high-level programming model and makes programming much easier. We describe the design of MapCG, including the MapReduce-style high-level programming framework and the runtime system on the CPU and GPU. We implemented a prototype of the MapCG runtime that supports multi-core CPUs and NVIDIA GPUs. Our experimental results show that this implementation can execute the same source code efficiently on multi-core CPU platforms and GPUs, achieving an average speedup of 1.6-2.5x over previous implementations of MapReduce on eight commonly used applications.