To achieve higher throughput and better performance per watt, many-core processors abandon complex out-of-order cores and instead integrate a large number of lightweight in-order cores on a single chip. To better support data sharing among cores and to reduce the overhead of off-chip memory accesses, many-core processors usually employ a shared last-level cache (LLC). Because the LLC must serve a large number of relatively independent access requests, it is more prone to thrashing than the LLC of a conventional multi-core processor. The conventional least recently used (LRU) cache replacement policy is largely ineffective in this situation, and several recently proposed replacement policies help little. This paper proposes an improved cache replacement policy based on the classic least frequently used (LFU) policy. Compared with LFU, the proposed policy collects information at a coarser granularity and captures more global information; these properties make it better suited as the replacement policy for the LLC of a many-core processor. Experimental results on a 64-core many-core processor show that the policy effectively alleviates LLC thrashing while requiring very little hardware overhead.
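As a rough illustration of why a frequency-based policy can resist thrashing where LRU cannot, the sketch below compares textbook LRU and textbook LFU on a synthetic trace. It is not the policy proposed in this paper: the coarser-grained, more global bookkeeping described above is not modeled. The cache is assumed to be fully associative, and the trace generator `make_trace`, the block addresses, and the cache capacity are all hypothetical choices made only for this example.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity):
    """Count hits for a fully associative cache with textbook LRU replacement."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[addr] = True
    return hits


def simulate_lfu(trace, capacity):
    """Count hits for a fully associative cache with textbook LFU replacement.

    Counters exist only while a block is resident. Ties are broken in favor
    of evicting the block that was inserted earliest.
    """
    counts = {}  # resident block -> number of accesses since insertion
    hits = 0
    for addr in trace:
        if addr in counts:
            hits += 1
            counts[addr] += 1
        else:
            if len(counts) >= capacity:
                victim = min(counts, key=counts.get)  # lowest access count
                del counts[victim]
            counts[addr] = 1
    return hits


def make_trace(rounds, scan_len):
    """Hypothetical trace: two hot blocks reused each round, then a scan
    of never-reused blocks that stands in for unrelated requests."""
    trace = []
    next_fresh = 1000  # arbitrary starting address for the streaming blocks
    for _ in range(rounds):
        trace += [0, 1, 0, 1]
        trace += list(range(next_fresh, next_fresh + scan_len))
        next_fresh += scan_len
    return trace


if __name__ == "__main__":
    trace = make_trace(rounds=50, scan_len=4)
    print("accesses:", len(trace))
    print("LRU hits:", simulate_lru(trace, capacity=4))
    print("LFU hits:", simulate_lfu(trace, capacity=4))
```

In this toy setting, every scan pushes the two hot blocks out of the LRU cache, so they miss again at the start of each round. LFU keeps them resident because their access counts stay above those of the streaming blocks. Real LLCs are set-associative and see far more varied traffic, so the sketch shows only the general effect, not the magnitude reported in the experiments.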