Human action recognition is a popular and important research topic in computer vision, but it becomes especially difficult when the viewpoint changes. So far, most work on action and gesture recognition has been built on view-dependent representations. A small number of approaches use view-invariant representations, but most of them have weaknesses, such as carrying too little information for recognition, or depending on robust detection of semantic feature points or on point correspondence. To address this problem and achieve action recognition that is independent of both viewpoint and subject, this paper proposes a representation called "Envelop Shape", which does not depend on a particular viewpoint. In human action recognition, rotation of the body about its vertical axis is usually the main cause of viewpoint change. "Envelop Shape" makes full use of the silhouettes captured by two orthogonal cameras to remove the effect of this rotation, and it is easy to compute from those silhouettes. Using the "Envelop Shape" pose representation together with a Hidden Markov Model, the experiments achieve good results for subject-independent action recognition under arbitrary viewpoints. These results also indicate that "Envelop Shape" carries sufficiently discriminative information for action recognition. An extension of "Envelop Shape" is also proposed so that it can be applied under more general camera configurations in which the two cameras are not exactly orthogonal, which considerably increases its practical value.