A semantic gap often exists between the low-level features and the high-level semantics of human motion capture data. To address this problem, we draw on ideas from deep learning and propose a semantic recognition algorithm for motion capture data that fuses a Restricted Boltzmann Machine (RBM) based generative model with an RBM based discriminative model. The algorithm uses two RBM layers: a feature extraction layer that extracts discriminative features from the motion capture data, and a semantic discrimination layer that recognizes motion style. First, because autoregressive models represent temporal information well, we build a one-way three-factored conditional RBM as the generative model and use it to extract the spatiotemporal features of the captured motions. Then, the extracted features are coupled with their corresponding style labels and used as the current-frame visible input of the discriminative RBM in the semantic discrimination layer, which is trained to recognize the style of single frames. Finally, based on the per-frame parameters obtained, a voting space is added on top of the model to recognize the style semantics of entire motion capture sequences. Experimental results show that the proposed algorithm is robust and extensible, meets the need to recognize diverse motion sequences, and facilitates the effective reuse of motion capture data.
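The final voting step can be illustrated with a small sketch. The abstract does not specify the voting rule, so this example assumes a simple majority vote over per-frame style predictions. The function name `vote_sequence_label` and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def vote_sequence_label(frame_labels: Sequence[Hashable]) -> Hashable:
    """Return the style label predicted for the most frames.

    Assumption: a plain majority vote; the paper's voting space may
    weight or combine per-frame results differently.
    """
    if not frame_labels:
        raise ValueError("frame_labels must contain at least one prediction")
    counts = Counter(frame_labels)
    # most_common(1) yields the (label, count) pair with the highest count.
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-frame predictions for one motion sequence, standing in
# for the output of the single-frame discriminative layer.
frame_predictions = ["walk", "walk", "run", "walk", "run", "walk"]
sequence_label = vote_sequence_label(frame_predictions)
print(sequence_label)
```

Under this assumption, a sequence is assigned the style that wins the most frames, so a few misclassified frames do not change the sequence-level result.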