Processor speed and memory capacity have grown near-exponentially, but memory access latency has improved far more slowly, and the widening gap increasingly limits further gains in processor performance. Reducing load-to-use latency is the key to improving a processor's memory access performance, and with other conditions fixed, increasing the bandwidth of the memory access path is the most effective way to reduce it. Higher bandwidth, however, increases the hardware logic complexity of the memory access path and therefore its power consumption. This work starts from an analysis of the inherent memory access characteristics of programs, explores the design and optimization space of a high-bandwidth memory pipeline, identifies regularities in program memory access behavior, and uses those regularities to derive a low-complexity, low-latency, low-power design for a high-bandwidth memory pipeline. The results greatly simplify the design of the high-bandwidth memory pipeline, reduce the delay and power consumption of the critical path, and were used to guide the memory access design of the Godsonx processor. With a 1.7% increase in total processor area, the bandwidth of the memory pipeline is doubled and overall processor performance improves by 8.6% on average.