The Explicit Data Graph Execution (EDGE) ISA is an instruction set architecture designed specifically for tiled many-core processors driven by a dataflow-like execution model. Unlike conventional control-flow-driven processors, an EDGE architecture takes the hyperblock, a group of EDGE instructions, rather than the individual instruction as its unit of execution. Instructions inside a hyperblock execute in dataflow order, while the hyperblocks themselves execute speculatively in control-flow order. Both features help to exploit a program's instruction-level parallelism (ILP). However, the EDGE compiler forms hyperblocks in the program's sequential execution order, so execution both across and within hyperblocks is constrained by data dependences. This weakens the potential data-level parallelism (DLP) and thread-level parallelism (TLP) of the running program and prevents the tiled EDGE design from showing its full advantage. This paper analyzes how the EDGE compiler forms hyperblocks and, taking into account the execution model specific to EDGE architectures, proposes a general framework for organizing hyperblocks that emulates the effect of multithreaded execution on an EDGE architecture, further exploiting the ILP of serial single-threaded programs. We use the TRIPS microprocessor, an EDGE prototype, as the example processor and validate the feasibility and generality of the framework with three experiments, one of which is matrix multiplication. The experimental results show that these applications achieve good performance improvements on TRIPS.
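To give a rough feel for what running a single-threaded program "in a multithreading way" can mean, the sketch below interleaves several independent rows of a matrix multiplication inside one loop body. It is only an illustration under stated assumptions, not the framework proposed in the paper and not the TRIPS compiler's hyperblock formation algorithm. The names `matmul_serial` and `matmul_interleaved` and the parameter `lanes` are hypothetical. The point is that placing independent dependence chains side by side in one block gives a dataflow-style machine more instructions that are ready to fire at the same time.

```python
def matmul_serial(a, b):
    """Reference version: one accumulation chain at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            acc = 0
            for k in range(m):
                # Each step depends on the previous value of acc,
                # so this inner loop is one serial dependence chain.
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c


def matmul_interleaved(a, b, lanes=2):
    """Hypothetical sketch: process `lanes` rows together.

    Each lane plays the role of a virtual thread. The lanes share no
    data dependences, so their multiply-adds could issue in parallel
    if they were placed in the same block.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, lanes):
        rows = range(i0, min(i0 + lanes, n))  # last group may be smaller
        for j in range(p):
            accs = [0] * len(rows)  # one independent accumulator per lane
            for k in range(m):
                bkj = b[k][j]  # operand shared by all lanes
                for t, i in enumerate(rows):
                    accs[t] += a[i][k] * bkj
            for t, i in enumerate(rows):
                c[i][j] = accs[t]
    return c
```

Both functions compute the same product; only the order in which independent work is grouped differs. In Python this grouping brings no speedup by itself. It only shows the shape of the transformation that a block-forming compiler could exploit.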