Speeded-Up Robust Features (SURF) is an interest point detection and matching algorithm that is widely used in computer vision. The Open Computing Language (OpenCL) is a framework for writing parallel programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. This paper describes how to implement and optimize SURF on a general-purpose GPU using OpenCL, and compares and discusses in detail several optimization methods, such as the configuration of kernel threads and the use of local memory. The final OpenCL version running on an NVIDIA GTX 260 is at least 21 times faster than the original CPU version running on an Intel Dual-Core E5400 at 2.7 GHz.
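As a small illustration of what configuring the kernel threads involves on the host side, the sketch below pads a two-dimensional global work size to a multiple of the work-group size. OpenCL 1.x requires this of `clEnqueueNDRangeKernel` when a local size is given explicitly. The helper names `round_up_to_multiple` and `surf_global_size_2d`, and the one-work-item-per-pixel mapping, are assumptions made for this example and are not taken from the implementation described here.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the nearest multiple of group (group must be > 0). */
size_t round_up_to_multiple(size_t n, size_t group)
{
    assert(group > 0);
    size_t rem = n % group;
    return rem == 0 ? n : n + (group - rem);
}

/*
 * Hypothetical helper: compute a padded 2D global work size for an image of
 * width x height pixels, assuming one work-item per pixel and a work-group
 * of local[0] x local[1] work-items. The resulting global[] and local[]
 * arrays are what would be passed to clEnqueueNDRangeKernel; the kernel
 * must then skip work-items whose coordinates fall outside the image.
 */
void surf_global_size_2d(size_t width, size_t height,
                         const size_t local[2], size_t global[2])
{
    global[0] = round_up_to_multiple(width, local[0]);
    global[1] = round_up_to_multiple(height, local[1]);
}
```

With a 16 x 16 work-group, for instance, a 1000 x 750 image yields a global size of 1008 x 752. The work-group shape itself is a tuning parameter whose best value depends on the kernel and the device.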