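Heterogeneous architectures are developing rapidly, and it is important to rely on the compiler to exploit the data locality of applications and to make full use of the on-chip caches of accelerator devices. However, traditional reuse distance is sensitive to platform differences in a heterogeneous setting and lacks a unified calculation framework. To better characterize and optimize the locality of heterogeneous programs, this paper establishes a reuse distance calculation mechanism and a data layout optimization framework that are unified across platforms. Based on how applications execute in parallel on heterogeneous architectures, the framework proposes relaxed reuse distance from the perspective of statistical averaging and, taking OpenCL programs as an example, gives its calculation method, providing a unified basis for data layout optimization decisions on multiple platforms. To verify the effectiveness of the method, experiments were conducted on three platforms: Intel Xeon Phi, AMD Opteron CPU, and Tilera TILE-Gx36. The results show that the method achieves an average speedup of at least 1.14x across the platforms.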
With the rapid development of heterogeneous systems, it is important for the compiler to enhance data locality and fully utilize the on-chip cache. However, the classic reuse distance criterion is platform-sensitive in heterogeneous systems, so a unified reuse distance calculation framework is needed for the compiler to describe and optimize data locality. This paper proposes relaxed reuse distance, with a unified calculation method for OpenCL programs, as the criterion for data layout optimization. Relaxed reuse distance is calculated from the heterogeneous execution model using statistical approximation. Experiments are conducted on Intel Xeon Phi, AMD Opteron CPU, and Tilera TILE-Gx36, and the results show that this optimization achieves a speedup of at least 1.23x on average.
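For readers unfamiliar with the baseline concept, the following is a minimal sketch of classic (sequential) reuse distance, i.e. the number of distinct addresses touched between two consecutive accesses to the same address. It is not the relaxed reuse distance proposed in this paper, whose calculation is not given in the abstract. The function name `reuse_distances` and the representation of a trace as a list of addresses are assumptions made for illustration.

```python
def reuse_distances(trace):
    """Classic reuse distance for each access in a sequential trace.

    `trace` is a list of hashable addresses (an illustrative assumption).
    The distance of an access is the number of distinct addresses touched
    since the previous access to the same address; first-time accesses
    have no reuse and are reported as None.
    """
    last_seen = {}   # address -> index of its most recent access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Accesses strictly between the previous and current use.
            between = trace[last_seen[addr] + 1 : i]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances
```

A short distance suggests the reused data is likely still in cache. The slice-and-set approach is quadratic in the worst case and is only meant to make the definition concrete.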