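To address the high computational complexity and picture-quality loss of traditional pixel-domain multi-stream video mixing in videoconferencing applications, a compressed-domain alternative is proposed, and the steps of its bitstream mapping algorithm are given in detail. Following the spatial layout of the mixed multi-picture frame, the method rearranges the macroblock coding order of the input bitstreams and maps their syntax elements, compositing the multiple video streams into a single picture at the bitstream level. It also adopts a pre-quantization strategy to eliminate the requantization distortion that might otherwise occur, and thus combines fast processing with high fidelity. The method is validated using H.263 as an example. Experimental results show that, compared with cascading a decoder and an encoder, the method raises the peak signal-to-noise ratio (PSNR) by 2 dB on average and improves computational efficiency more than a hundredfold. This work is expected to contribute a video mixing solution to H.265, the international video coding standard currently being drafted.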
To address the high computational complexity and picture quality degradation of traditional pixel-domain video mixing in multipoint videoconferencing, this paper proposes a multipoint video composition scheme that operates in the discrete cosine transform (DCT) compressed domain and describes the bitstream mapping algorithm in detail. Following the spatial layout of the composite picture, the scheme rearranges the macroblock coding order of the input bitstreams and maps their syntax elements, merging the multiple video streams into a single picture at the syntax level. A pre-quantization strategy is adopted to eliminate the requantization errors that could otherwise arise, so the scheme is both fast and high-fidelity. To verify its effectiveness, the algorithm was integrated into an H.263 codec. Experimental results show that, compared with the cascaded decoder-encoder method, the proposed method improves the average peak signal-to-noise ratio (PSNR) by almost 2 dB and increases computational efficiency a hundredfold. This work may provide a video mixing solution for the international video coding standard H.265, which is under development.
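The macroblock reordering step can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, purely for illustration, that four QCIF inputs (11 x 9 macroblocks each) are tiled 2 x 2 into one CIF picture (22 x 18 macroblocks), and every name in it (`map_mb_address`, `output_coding_order`, `QUADRANT_OFFSETS`) is hypothetical. It shows only how a raster-scan macroblock address in an input stream could be mapped to its address in the mixed picture, and the resulting order in which input macroblocks would be emitted.

```python
# Sketch only: macroblock address remapping for a 2x2 compressed-domain mix.
# Assumption (not from the paper): four QCIF inputs tiled into one CIF picture.

QCIF_MB_COLS, QCIF_MB_ROWS = 11, 9   # 176x144 luma samples / 16x16 macroblocks
CIF_MB_COLS, CIF_MB_ROWS = 22, 18    # 352x288 luma samples / 16x16 macroblocks

# Hypothetical layout: quadrant of each input stream as (row offset, col offset).
QUADRANT_OFFSETS = {
    0: (0, 0),                        # top-left
    1: (0, QCIF_MB_COLS),             # top-right
    2: (QCIF_MB_ROWS, 0),             # bottom-left
    3: (QCIF_MB_ROWS, QCIF_MB_COLS),  # bottom-right
}


def map_mb_address(stream_id, src_addr):
    """Map a raster-scan macroblock address of one input stream
    to its raster-scan address in the mixed CIF picture."""
    if not 0 <= src_addr < QCIF_MB_COLS * QCIF_MB_ROWS:
        raise ValueError("macroblock address outside the QCIF picture")
    row, col = divmod(src_addr, QCIF_MB_COLS)
    row_off, col_off = QUADRANT_OFFSETS[stream_id]
    return (row + row_off) * CIF_MB_COLS + (col + col_off)


def output_coding_order():
    """Return (stream_id, src_addr) pairs in the raster-scan coding
    order of the mixed picture."""
    order = [None] * (CIF_MB_COLS * CIF_MB_ROWS)
    for stream_id in QUADRANT_OFFSETS:
        for src_addr in range(QCIF_MB_COLS * QCIF_MB_ROWS):
            order[map_mb_address(stream_id, src_addr)] = (stream_id, src_addr)
    return order


if __name__ == "__main__":
    order = output_coding_order()
    # The first output row interleaves stream 0 and stream 1.
    print(order[:3], order[11:14])
```

Reordering addresses is only part of the task. Syntax elements that H.263 codes relative to neighbouring macroblocks, such as motion vector predictions and quantizer changes, would presumably have to be re-derived for the new neighbourhood, which is likely what the syntax-element mapping in the abstract covers.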