Mapping loops onto a reconfigurable array can improve performance. To address the problem of mapping large loops onto a reconfigurable processor, this paper proposes a heuristic hardware-software partitioning algorithm that divides a large loop into a software part executed on the processor and a hardware part executed on the reconfigurable array, while minimizing the amount of data transferred between the two parts. In tests, the technique reduces loop execution time by 13%-29% compared with existing methods for handling large loops. The partitioning technique has been verified on an FPGA-based verification system using several core multimedia algorithms, including motion estimation in H.264 and the IDCT in MPEG-2. With this technique, and without any increase in hardware scale, the verification system achieves an average performance improvement of 3.5 times over similar architectures.
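The idea of splitting a loop between processor and array while keeping communication small can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in the abstract. It is a simple greedy heuristic written under stated assumptions: the loop body is modelled as a data-flow graph, edge weights are words transferred per iteration, and the array can hold at most `capacity` operations. All names here (`cut_cost`, `greedy_partition`, the operation names, the weights) are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def cut_cost(edges: Dict[Edge, int], hw: Set[str]) -> int:
    """Words per iteration that cross the hardware/software boundary."""
    return sum(w for (u, v), w in edges.items() if (u in hw) != (v in hw))


def greedy_partition(
    nodes: Iterable[str],
    edges: Dict[Edge, int],
    hw_capable: Set[str],
    capacity: int,
) -> Tuple[Set[str], int]:
    """Assumed heuristic: fill the array up to `capacity` operations,
    each time adding the operation that yields the smallest cut cost."""
    nodes = list(nodes)
    hw: Set[str] = set()
    while len(hw) < capacity:
        # Sort candidates so that ties are broken deterministically.
        candidates = sorted(n for n in nodes if n in hw_capable and n not in hw)
        if not candidates:
            break
        best = min(candidates, key=lambda n: cut_cost(edges, hw | {n}))
        hw.add(best)
    return hw, cut_cost(edges, hw)


# Hypothetical loop body: load -> mul -> add -> sub -> store, plus load -> sub.
NODES = ["load", "mul", "add", "sub", "store"]
EDGES = {
    ("load", "mul"): 1,
    ("mul", "add"): 4,
    ("add", "sub"): 4,
    ("sub", "store"): 1,
    ("load", "sub"): 1,
}
# Assumption: memory accesses stay on the processor.
HW_CAPABLE = {"mul", "add", "sub"}

if __name__ == "__main__":
    hw_part, cost = greedy_partition(NODES, EDGES, HW_CAPABLE, capacity=2)
    print(sorted(hw_part), cost)
```

With room for two operations, the sketch places `mul` and `add` on the array, because the heavy `mul`-to-`add` edge then stays inside the hardware part. A real partitioner would also have to account for execution time and array resource types, which this sketch ignores.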