This paper proposes a method for classifying human actions in still images that combines pose (poselet) features with scene information. For the scene branch, images are described with SIFT features extracted by multi-scale dense sampling. A non-parametric probability density estimator models the distribution of samples in feature space, and the gradient vectors of the estimated density are aggregated over the visual words of a codebook, which yields a compact and discriminative scene encoding. For the pose branch, action-specific pose features are extracted from the appearance and configuration of human body parts, and max-margin pose classifiers compute a score for each test sample against every action class. Finally, the outputs of the pose classifier and the scene classifier are combined to assign the final class label. We evaluate the approach on the Willow-actions dataset, and the experimental results demonstrate its effectiveness, with better performance than several baseline methods.
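The final step, combining the per-class outputs of the pose and scene classifiers, can be illustrated with a minimal Python sketch. The abstract does not state the fusion rule, so the softmax normalization, the weighted sum, the `pose_weight` parameter, the function names, and the three placeholder class labels below are all assumptions made for illustration, not the authors' implementation.

```python
from math import exp


def softmax(scores):
    """Normalize raw classifier scores so they are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(pose_scores, scene_scores, pose_weight=0.5):
    """Weighted sum of normalized per-class scores (assumed fusion rule)."""
    if len(pose_scores) != len(scene_scores):
        raise ValueError("both classifiers must score the same set of classes")
    if not 0.0 <= pose_weight <= 1.0:
        raise ValueError("pose_weight must lie in [0, 1]")
    pose = softmax(pose_scores)
    scene = softmax(scene_scores)
    return [pose_weight * p + (1.0 - pose_weight) * s
            for p, s in zip(pose, scene)]


def predict(pose_scores, scene_scores, labels, pose_weight=0.5):
    """Return the label with the highest fused score, plus all fused scores."""
    fused = fuse_scores(pose_scores, scene_scores, pose_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Placeholder labels and made-up scores, for illustration only.
LABELS = ["riding_bike", "running", "walking"]
label, fused = predict([0.2, 1.5, 1.4], [2.0, -0.5, 0.1], LABELS)
print(label)
```

In this made-up example the pose scores alone barely separate "running" from "walking", while the scene scores strongly favour "riding_bike", so the fused decision follows the scene cue. In practice the weight would be tuned on held-out data.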