Objective Human action recognition is an important research topic in computer vision. Recognizing human actions in unconstrained videos is difficult because of complex backgrounds, camera shake, and other factors. To address these problems, a human action recognition algorithm based on salient and robust trajectories is proposed. Method The algorithm uses dense optical flow to track salient feature points at multiple spatial scales, and describes the salient trajectories with histogram of oriented gradients (HOG), histogram of optical flow (HOF), and motion boundary histogram (MBH) features. To effectively eliminate the influence of camera motion, a camera motion estimation technique based on adaptive background segmentation is used to enhance the robustness of the salient trajectories. For each type of feature, the Fisher vector model then represents a video as a single Fisher vector, and a linear support vector machine classifies the video. Result On four public datasets, the salient trajectory algorithm outperforms the dense trajectory algorithm by 1% on average. With camera motion elimination added, the salient and robust trajectory algorithm outperforms the salient trajectory algorithm by a further 2% on average. On the four datasets (i.e., Hollywood2, YouTube, Olympic Sports, and UCF50), the salient and robust trajectory algorithm achieves 65.8%, 91.6%, 93.6%, and 92.1%, exceeding the previous best results by 1.5%, 2.6%, 2.5%, and 0.9%, respectively. Conclusion Experimental results show that the algorithm can effectively recognize human actions in unconstrained videos and has low time complexity.
Background In the past few years, we have witnessed the great success of social networks and multimedia technologies, leading to the generation of a vast amount of Internet videos. To organize these videos and to provide value-added services to users, human activities in videos should be recognized automatically. A number of research studies have focused on this challenging topic. Objective Human action recognition is a significant research topic in computer vision. Recognizing human actions in unconstrained videos is difficult because of complex backgrounds and camera motion. A robust and salient trajectory-based approach is proposed to address these problems. Method Dense optical flow is utilized to track scale-invariant feature transform keypoints at multiple spatial scales. The histogram of oriented gradients, histogram of optical flow, and motion boundary histogram are employed to depict each trajectory efficiently. To eliminate the influence of camera motion, a camera motion estimation approach based on adaptive background segmentation is utilized to improve the robustness of the trajectories. The Fisher vector model is utilized to compute one Fisher vector over a complete video for each descriptor separately, and a linear support vector machine is employed for classification. Result The salient trajectory algorithm outperforms the dense trajectory algorithm by 1% on average on four challenging datasets. After the camera motion elimination approach is applied, the result of the salient trajectory algorithm improves by a further 2% on average. On four datasets (i.e., Hollywood2, YouTube, Olympic Sports, and UCF50), the proposed algorithm obtains 65.8%, 91.6%, 93.6%, and 92.1%, improving on the previous state-of-the-art results by 1.5%, 2.6%, 2.5%, and 0.9%, respectively.
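The core tracking step of trajectory-based methods can be illustrated with a short sketch: each tracked point is advanced one frame by the dense optical-flow field sampled at its position. This is a minimal illustration under stated assumptions, not the authors' implementation: the flow field is assumed to be precomputed (in practice by an optical-flow algorithm such as Farnebäck's), and dense-trajectory methods typically smooth the flow (e.g., with a median filter) before sampling it.

```python
import numpy as np

def propagate_points(points, flow):
    """Advance tracked points one frame along a dense optical-flow field.

    points: (N, 2) array of (x, y) positions in pixels.
    flow:   (H, W, 2) array giving the (dx, dy) displacement at each pixel.

    Each point is displaced by the flow sampled at its nearest pixel;
    positions are clipped to the image bounds before sampling.
    """
    h, w = flow.shape[:2]
    xs = np.clip(np.round(points[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(points[:, 1]).astype(int), 0, h - 1)
    return points + flow[ys, xs]

# Toy example: a uniform flow of (+1, +2) pixels everywhere,
# so every tracked point shifts right by 1 and down by 2.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0  # dx
flow[..., 1] = 2.0  # dy
pts = np.array([[0.0, 0.0], [2.0, 1.0]])
print(propagate_points(pts, flow))  # -> [[1. 2.] [3. 3.]]
```

Concatenating the positions produced by repeated propagation over a fixed number of frames yields one trajectory, along which the HOG, HOF, and MBH descriptors are then extracted.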
Conclusion Experimental results on four challenging datasets demonstrate that the proposed algorithm can effectively recognize human actions in unconstrained videos in a computationally efficient manner.