Memory bandwidth is a key limit on the performance of on-chip many-core processors. Sharing the last-level on-chip cache among all processor cores is necessary, because it eliminates the capacity wasted when multiple copies of one data block are held in private caches. However, conflicts in a shared cache are more severe in an on-chip many-core architecture than in a traditional single-processor architecture, so isolating conflicting data blocks is crucial to shared-cache performance. This paper proposes a hardware approach, block agglutinating, that isolates the blocks of different threads in the shared cache. The proposed scheme is evaluated on a cycle-accurate many-core processor simulator using the Splash2 benchmarks and bioinformatics workloads. Experimental results show that, compared with a traditional shared cache, block agglutinating reduces the conflict miss rate of the shared cache by about 20% and improves IPC by about 10% on average.