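The ARM11 MPCore is the newest embedded multi-core processor, and traditional embedded software cannot exploit its full performance on this platform. To address this problem, the basic computing capability of the ARM11 MPCore is compared with that of ARM9 processors, and software optimization methods based on the hardware vector floating-point unit and on parallel computing are proposed. Experimental results show that the floating-point performance of the optimized MPCore is about 10 times that of ARM9-series processors. For the multi-core architecture, software can be optimized with a parallel computing model; experimental results show that after integer operations are optimized with the OpenMP parallel computing model, the efficiency of the test program improves by about 3.8 times. For embedded multimedia processing, the DCT is optimized with the hardware vector floating-point unit, and video decoding, audio decoding, and synchronized audio/video display are optimized through parallel processing. Experimental results show that these two optimization methods improve the running efficiency of software on the ARM11 MPCore platform and raise overall system performance.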
With the wide application of multimedia, networking, and digital signal processing in the embedded field, the performance required of embedded processors kept rising. To improve performance and reduce power consumption, MPSoC (multiprocessor system on chip) designs were used to build embedded systems. The ARM11 MPCore was the newest embedded multi-core processor. Traditional embedded software could not run efficiently on this platform, because it used only one processor core and did not use the VFP (vector floating point) module to accelerate floating-point computation. To solve this problem, the basic computing performance of the ARM11 MPCore and the ARM9 was analyzed, and parallel computing and hardware acceleration were adopted to optimize multi-core performance on the ARM11 MPCore. With these optimizations, the ARM11 MPCore clearly outperformed the single-core processor in the experiments. The results showed that floating-point performance improved by about 10 times when the VFP optimization was used, and that program efficiency improved by about 3.8 times when the parallel computing optimization was used. For multimedia programs, a single-core embedded processor could not handle decoding because of its poor floating-point performance, while on a multi-core embedded processor traditional embedded software was not suited to the new MPSoC architecture. To deal with this problem, two optimization methods were presented. The first was DCT optimization with the VFP, which greatly improved floating-point performance. The second was parallelization of audio decoding, video decoding, and playback, which placed the three threads on three different processor cores.
The experiments showed that both optimization methods improved the performance of software running on the ARM11 MPCore platform.
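
As an illustration of the OpenMP optimization of integer operations mentioned above, the following is a minimal sketch and not the authors' test program. The function name `sum_of_squares` and the workload are assumptions chosen only to show how an integer loop can be split across cores with a reduction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical integer kernel (an assumption, not taken from the paper):
 * returns the sum of squares of n 32-bit integers.
 *
 * When compiled with OpenMP enabled (for example, gcc -fopenmp), the loop
 * iterations are divided among the available cores and each thread keeps a
 * private partial sum; the reduction clause adds the partial sums together
 * at the end. Without OpenMP the pragma is ignored and the loop runs
 * serially, giving the same result. */
int64_t sum_of_squares(const int32_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    int64_t total = 0;

    /* A signed loop index keeps the loop in the canonical form that all
     * OpenMP versions accept. */
    #pragma omp parallel for reduction(+:total) schedule(static)
    for (long i = 0; i < (long)n; ++i) {
        total += (int64_t)data[i] * (int64_t)data[i];
    }

    return total;
}
```

Calling `sum_of_squares` on an array holding 1 through 1000 returns 333833500 whether or not OpenMP is enabled, which makes it easy to check that the parallel version matches the serial one before measuring speedup on a multi-core target.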