Head pose estimation is the main cue for recognizing a user's visual focus of attention. In real applications, however, large pose variations, low-resolution images, and varying illumination make reliable and accurate head pose estimation difficult. To address these difficulties, we propose a method for recognizing the visual focus of attention based on a dynamic Bayesian network model. Rather than computing an explicit pose value, the method measures head pose by a similarity vector that represents the likelihoods of the face image under multiple face pose clusters. The model encodes the probabilistic dependencies among multiple foci of attention, multiple user locations, and face images captured by multiple cameras, and performs joint inference over them. Experimental results on data collected in a prototype ambient kitchen show that the model is effective.
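The idea of replacing an explicit pose estimate with a similarity vector, and of fusing several cameras in a temporal model, can be illustrated with a minimal sketch. It is not the model of this paper: the function names `similarity_vector` and `filter_step`, the softmax over negative squared distances, the first-order transition table, and the conditional independence of cameras given the target are all assumptions made for illustration, and the user-location variable is omitted for brevity.

```python
import math


def similarity_vector(feature, centroids, scale=1.0):
    """Soft similarity of one face feature to each pose-cluster centroid.

    Assumption: similarity is a softmax over negative squared distances.
    """
    scores = [
        -scale * sum((f - c) ** 2 for f, c in zip(feature, centroid))
        for centroid in centroids
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def filter_step(belief, transition, pose_given_target, similarities):
    """One forward-filtering step over the attention targets.

    belief[i]                   : P(target i at t-1 | past observations)
    transition[i][j]            : P(target j at t | target i at t-1)
    pose_given_target[cam][j][k]: P(pose cluster k | target j), seen from camera cam
    similarities[cam][k]        : similarity vector from camera cam at time t
    """
    n = len(belief)
    # Prediction: propagate the previous belief through the transition table.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    posterior = []
    for j in range(n):
        likelihood = 1.0
        # Assumption: cameras are conditionally independent given the target.
        for cam, sim in enumerate(similarities):
            likelihood *= sum(
                s * p for s, p in zip(sim, pose_given_target[cam][j])
            )
        posterior.append(predicted[j] * likelihood)
    total = sum(posterior)  # assumed to be positive
    return [p / total for p in posterior]


# Hypothetical toy setup: two targets, two pose clusters, one camera.
belief = filter_step(
    belief=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.1, 0.9]],
    pose_given_target=[[[0.8, 0.2], [0.2, 0.8]]],
    similarities=[[0.9, 0.1]],
)
```

In this toy setup the face looks far more like pose cluster 0, which target 0 favors, so the belief shifts toward target 0. Because the observation enters as a soft vector rather than a single pose value, an ambiguous face image (a nearly uniform similarity vector) changes the belief only slightly.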