To recognize manual alphabet signs in sign language video accurately, this paper proposes an improved algorithm that combines DI_CamShift (depth image CamShift) with sign language visual word (SLVW) features. First, a Kinect sensor captures video of the manual alphabet gestures together with their depth information. Second, the principal-axis orientation angle and the centroid position of the gesture are computed from the depth image and used to adjust the search window for gesture tracking. Third, the gesture is segmented with an Otsu (OTSU) thresholding method based on the depth integral image, its scale-invariant feature transform (SIFT) and Gabor features are extracted, and the two feature sets are fused by canonical correlation analysis (CCA). Finally, an SLVW bag of words is built and a support vector machine (SVM) performs the recognition. The highest recognition rate for a single manual alphabet sign is 99.89%, and the average recognition rate is 96.34%.
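The tracking step relies on the centroid and principal-axis angle of the gesture region. The abstract does not give the exact formulas, so the following is only a minimal sketch that assumes the standard image-moment formulation used by CamShift-style trackers. The function name `centroid_and_orientation` and the idea of passing a binary or depth-derived weight image are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def centroid_and_orientation(weights):
    """Return (xc, yc, theta) for a 2-D non-negative weight image.

    Hypothetical helper: `weights` could be a binary hand mask or a
    depth-derived probability map. theta is the principal-axis angle in
    radians, measured from the x axis in image coordinates.
    """
    w = np.asarray(weights, dtype=np.float64)
    m00 = w.sum()  # zeroth moment (total mass)
    if m00 <= 0:
        raise ValueError("weight image is empty")

    # Pixel coordinate grids: ys indexes rows, xs indexes columns.
    ys, xs = np.mgrid[0:w.shape[0], 0:w.shape[1]]

    # First moments give the centroid.
    xc = (xs * w).sum() / m00
    yc = (ys * w).sum() / m00

    # Normalized second central moments.
    mu20 = ((xs - xc) ** 2 * w).sum() / m00
    mu02 = ((ys - yc) ** 2 * w).sum() / m00
    mu11 = ((xs - xc) * (ys - yc) * w).sum() / m00

    # Orientation of the principal axis.
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return xc, yc, theta


# Example: a horizontal bar occupying rows 4-5 and columns 2-9.
bar = np.zeros((10, 12))
bar[4:6, 2:10] = 1.0
xc, yc, theta = centroid_and_orientation(bar)
```

Because the image y axis points downward, a positive angle here corresponds to a clockwise rotation on screen. A tracker would typically re-center its search window on `(xc, yc)` in each frame and could use `theta` to orient the window.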