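Based on an improved linear processor array, an array processor architecture for full-search motion estimation is proposed; it performs its computations in parallel while requiring only serial data input. Analysis shows that the architecture not only executes efficiently but also needs only a very small internal buffer. Owing to its simple structure and regular data flow, it can be conveniently implemented in an FPGA device and used as a co-processor for a real-time encoder.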
Block-based motion estimation is essential in video coding because it effectively reduces the temporal redundancy of video sequences. This paper proposes an efficient VLSI architecture based on an improved linear processor array that accepts sequential inputs yet performs the calculations of the full-search block-matching algorithm in parallel. Theoretical analysis shows that the architecture achieves high efficiency while requiring only a small local cache. Owing to its simple structure and regular data flow, the architecture can be readily implemented in an FPGA device as a co-processor for a real-time video encoding chip.
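For readers unfamiliar with full-search block matching, the following is a minimal software reference sketch of the algorithm, not the proposed hardware architecture. It assumes the sum of absolute differences (SAD) as the matching criterion, which the abstract does not specify. The function names `sad` and `full_search`, the parameters `n` (block size) and `p` (search range), and the test frames are hypothetical and chosen only for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref. Frames are lists of rows."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Exhaustively test every displacement in [-p, p] x [-p, p] and
    return (best_cost, dx, dy) for the block at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Illustrative 8 x 8 frames (assumed data): cur is ref shifted by (2, 1),
# with zeros where the shifted sample would fall outside ref.
SIZE = 8
ref = [[x * x + 3 * y * y + x * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[y + 1][x + 2] if y + 1 < SIZE and x + 2 < SIZE else 0
        for x in range(SIZE)] for y in range(SIZE)]

best_cost, best_dx, best_dy = full_search(cur, ref, bx=2, by=2, n=2, p=2)
```

Each candidate displacement costs n² absolute differences, and a search range of p gives (2p + 1)² candidates per block. This regular, data-independent workload is what makes the algorithm a natural fit for a processor array in which several candidates are evaluated in parallel from a single serial pixel stream.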