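Based on an analysis of the time consumed by each module of the H.264 compression coding algorithm, the six-tap filter interpolation, which accounts for nearly 10% of the total program run time, is optimized according to the hardware characteristics of the TMS320DM642. The optimization uses intrinsic functions, data packing, a modified data structure for vertical interpolation, and EDMA ping-pong buffering. Experimental results show that, with the same compiler options, the optimized interpolation code runs nearly twice as fast.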
The time consumption of each module in an H.264 encoder is profiled, and the results show that the six-tap filter interpolation takes almost 10% of the encoding time. This procedure is optimized according to the hardware features of the TMS320DM642 chip. The optimization methods include the use of intrinsics, packed-data processing, a changed data structure for the vertical interpolation procedure, and EDMA ping-pong buffering. Experimental results show that, with the same build options, the processing speed of sub-pixel interpolation is nearly doubled.
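For reference, the six-tap filter named above can be sketched in portable C. This is a minimal scalar illustration of the standard H.264 half-pel luma filter with taps (1, -5, 20, 20, -5, 1), not the optimized DM642 code described in this work. The function names `clip_u8`, `halfpel6`, and `interp_row_h` are hypothetical, and the caller is assumed to supply two valid pixels to the left and three to the right of every filtered position.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range [0, 255]. */
static uint8_t clip_u8(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Half-pel sample between p[0] and p[1], using taps (1,-5,20,20,-5,1).
 * Assumes p[-2] .. p[3] are all readable (hypothetical helper). */
static uint8_t halfpel6(const uint8_t *p)
{
    int sum = p[-2] - 5 * p[-1] + 20 * p[0] + 20 * p[1] - 5 * p[2] + p[3];
    sum += 16;                 /* rounding offset before dividing by 32 */
    if (sum < 0)               /* avoid right-shifting a negative value */
        return 0;
    return clip_u8(sum >> 5);
}

/* Horizontal half-pel interpolation of n positions in one row.
 * src points at the first integer pixel; src[-2] .. src[n + 2]
 * must be valid (assumed padding, not taken from the source text). */
static void interp_row_h(const uint8_t *src, uint8_t *dst, int n)
{
    assert(src != 0 && dst != 0 && n >= 0);
    for (int i = 0; i < n; ++i)
        dst[i] = halfpel6(src + i);
}
```

Each output sample needs six multiply-accumulates on 8-bit data, which is why packing several pixels into one 32-bit word and processing them with packed-data intrinsics is a natural optimization target on this class of DSP. The vertical case applies the same taps down a column instead of along a row.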