A Survey of Algorithms for Predicting Positive and Negative Relations in Signed Social Networks

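ISSN: 1000-1239
Journal: Journal of Computer Research and Development (《计算机研究与发展》)
Date: 0
Classification: TP393 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education, Renmin University of China, Beijing 100872; [2] School of Information, Renmin University of China, Beijing 100872
Funding: This work was supported by the National Natural Science Foundation of China (61532021, 61272137, 61202114) and the Huawei Innovation Research Program (HIRP20140507).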

Authors: 蓝梦微[1,2], 李翠平[1,2], 王绍卿[1,2,3], 赵衎衎[1,2], 林志侠[1,2], 邹本友[1,2], 陈红[1,2]

Keywords: modern hardware processors, sorting algorithm, memory access hierarchy, parallelism optimization, graphics processing unit, field-programmable gate array

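Abstract (translated from the Chinese):

This paper studies parallel in-memory sorting methods on modern hardware and surveys their current status and progress. It first briefly describes the strengths and weaknesses of classical sorting algorithms and sorting networks and analyzes their suitability for parallel optimization. It then presents existing results on sorting methods for newer processor devices: modern CPUs (multicore, with large memory), graphics processing units (GPUs), and field-programmable gate arrays (FPGAs). Because processor architectures differ, so do the optimization strategies for sorting algorithms. Modern CPUs mainly exploit each thread's local storage hierarchy to optimize how data is laid out in storage units, reducing both the number of memory accesses and the number of access misses, and use single instruction, multiple data (SIMD) techniques to raise the algorithm's data-level parallelism. GPUs need to organize threads into thread blocks, rely on shared memory to speed up memory access by a thread block, and use single instruction, multiple threads (SIMT) techniques within a thread block to improve thread execution efficiency. FPGAs sit closer to the underlying hardware and are constrained by their own resources, so FPGA optimization relies mainly on hardware description languages or high-level synthesis languages to optimize the circuit design, improving resource utilization while increasing FPGA throughput. Existing results show that parallel in-memory sorting performs better on GPUs than on CPUs. Finally, the authors outline future research directions.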

English abstract:

This paper summarizes research achievements in parallel in-memory sorting methods on modern hardware. First, the advantages and disadvantages of classical sorting algorithms and sorting networks are briefly reviewed, and their suitability for parallel optimization is analyzed. Then the state-of-the-art sorting methods implemented on modern CPUs (multicore, equipped with large memory), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and other new processor devices are introduced. Different processor architectures call for different optimization strategies for sorting algorithms. Modern CPUs mainly exploit the thread-local memory hierarchy with aligned data in order to reduce the frequency of memory accesses and memory misses. Single Instruction Multiple Data (SIMD) technology is also used to improve the data-level parallelism of sorting methods. Threads of a GPU are organized into thread blocks, which use shared memory to improve the speed of memory access, and Single Instruction Multiple Threads (SIMT) is used within thread blocks to improve the execution efficiency of threads. Compared to CPUs and GPUs, the FPGA is closer to the underlying hardware and is limited by its own resources. Optimization strategies for the FPGA therefore focus on optimizing the circuit design using a hardware description language or a high-level synthesis language, making resource use more efficient and improving the throughput of the FPGA. According to recent results, GPUs perform better than CPUs on sorting. Finally, directions for further study are proposed.
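The abstract mentions sorting networks as candidates for parallel optimization. As an illustration only, and not taken from the paper, the following is a minimal Python sketch of a bitonic sorting network, one classic example of such a network. The function name `bitonic_sort` and the power-of-two length restriction are assumptions made for this sketch. The point it shows is that the sequence of compare-exchange steps depends only on the input length, never on the data, which is why each stage could in principle be spread across SIMD lanes or SIMT threads.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using a bitonic sorting network.

    Hypothetical helper for illustration; the input length must be a
    nonzero power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    a = list(values)
    k = 2
    while k <= n:          # k: size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # j: distance between the two compared elements
            # All compare-exchanges in this loop touch disjoint pairs,
            # so a parallel implementation could run them at the same time.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

Called as `bitonic_sort([7, 3, 6, 0, 5, 1, 4, 2])`, the sketch runs the same fixed schedule of comparisons for any input of length 8. It is written sequentially for readability and makes no claim about the performance figures discussed in the abstract.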
