This paper presents a method for 3D human pose estimation that combines shape and motion information extracted from multiple synchronized video streams. The human body is divided into three parts: head, torso, and limbs. For each part, motion information is used to predict the state in the current frame, and shape information serves as a detector to determine the pose. Using these complementary cues alleviates the twin problems of drift and convergence to local minima, and allows the system to initialize automatically and to reinitialize after failures. The use of multi-view data also makes it possible to handle self-occlusion and kinematic singularities. Experimental results on sequences containing different kinds of motion illustrate the effectiveness of the approach, and in comparative experiments it outperforms the Condensation algorithm and the annealed particle filter.
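The predict-then-detect idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the motion model, the shape detector, or the state parameterization, so the constant-velocity predictor, the toy shape score, the three-dimensional state vectors, and all function names (`predict_state`, `refine_with_shape`, `estimate_pose`, `demo`) are assumptions made purely for illustration.

```python
import random

# The three body-part groups named in the abstract.
PARTS = ("head", "torso", "limbs")


def predict_state(prev, prev_prev):
    """Motion cue: constant-velocity extrapolation (an assumed motion model)."""
    return [2.0 * a - b for a, b in zip(prev, prev_prev)]


def refine_with_shape(predicted, shape_score, rng, n_candidates=200, spread=0.1):
    """Shape cue: sample candidates around the prediction and keep the one
    that the (hypothetical) shape detector scores highest."""
    candidates = [list(predicted)]  # the prediction itself is always a candidate
    for _ in range(n_candidates):
        candidates.append([x + rng.gauss(0.0, spread) for x in predicted])
    return max(candidates, key=shape_score)


def estimate_pose(history, shape_scores, rng):
    """Estimate each part independently: predict from motion, verify with shape."""
    pose = {}
    for part in PARTS:
        prev_prev, prev = history[part]
        predicted = predict_state(prev, prev_prev)
        pose[part] = refine_with_shape(predicted, shape_scores[part], rng)
    return pose


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def demo(seed=0):
    """Toy setup: the 'shape detector' is the negative squared distance
    to a synthetic ground-truth state (a stand-in for a real image cue)."""
    rng = random.Random(seed)
    truth = {p: [rng.uniform(-1.0, 1.0) for _ in range(3)] for p in PARTS}
    history = {p: ([0.0] * 3, [0.1] * 3) for p in PARTS}
    shape_scores = {
        p: (lambda c, t=truth[p]: -squared_error(c, t)) for p in PARTS
    }
    return truth, history, estimate_pose(history, shape_scores, rng)
```

Because the motion prediction is always included among the candidates, the shape-refined estimate in this toy setup never scores worse than the prediction alone. That is the sense in which the two cues are complementary here: the prediction narrows the search, and the shape score corrects the drift that pure extrapolation would accumulate.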