位置:成果数据库 > 期刊 > 期刊详情页
Improving vertex-frontier based GPU breadth-first search
  • 期刊名称:Journal?of?Central?South?University
  • 时间:2014.10.26
  • 页码:3828-3836
  • 分类:TP311.13[自动化与计算机技术—计算机软件与理论;自动化与计算机技术—计算机科学与技术] TP334.7[自动化与计算机技术—计算机系统结构;自动化与计算机技术—计算机科学与技术]
  • 作者机构:[1]Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China, [2]College of Computer, National University of Defense Technology, Changsha 410073, China, [3]Department of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China, [4]Avatar Science Company, Guangzhou 510001, China
  • 相关基金:Projects(61272142,61103082,61003075,61170261,61103193)supported by the National Natural Science Foundation of China; Project supported by the Program for New Century Excellent Talents in University of China; Projects(2012AA01A301,2012AA010901)supported by the National High Technology Research and Development Program of China
  • 相关项目:基于动态测试用例生成的二进制软件缺陷自动发掘技术研究
中文摘要:

宽度优先的搜索(BFS ) 是为图遍历的一个重要内核并且被处理应用程序的许多图使用了。广泛的研究在增加 BFS 的表演被奉献了。作为最有效的答案, GPU 加速完成 3.3 ?獡氠慥楤杮愠摮琠慲汩湩 ? 摥敧朠潥敭牴敩 ? 爨摡畩 ? 湡 ? 桴捩湫獥 ? 的最先进的结果 ? 桴楥? 慭楸畭 ? 牰景汩 ? 桴捩湫獥 ? 琠敨物挠潨摲氠湥瑧 ? 湡 ? 桴楥? 瑳条敧 ?湡汧 吗?

英文摘要:

Breadth-first search(BFS) is an important kernel for graph traversal and has been used by many graph processing applications. Extensive studies have been devoted in boosting the performance of BFS. As the most effective solution, GPU-acceleration achieves the state-of-the-art result of 3.3×10^9 traversed edges per second on a NVIDIA Tesla C2050 GPU. A novel vertex frontier based GPU BFS algorithm is proposed, and its main features are three-fold. Firstly, to obtain a better workload balance for irregular graphs, a virtual-queue task decomposition and mapping strategy is introduced for vertex frontier expanding. Secondly, a global deduplicate detection scheme is proposed to remove reduplicative vertices from vertex frontier effectively. Finally, a GPU-based bottom-up BFS approach is employed to process large frontier. The experimental results demonstrate that the algorithm can achieve 10% improvement over the state-of-the-art method on diverse graphs. Especially, it exhibits 2-3 times speedup on low-diameter and scale-free graphs over the state-of-the-art on a NVIDIA Tesla K20 c GPU, reaching a peak traversal rate of 11.2×10^9 edges/s.

同期刊论文项目
期刊论文 22 会议论文 4 著作 1
期刊论文 20 会议论文 9
同项目期刊论文