Objective Heart rate is one of the important indicators that directly reflect human health, and non-contact heart rate detection based on video has broad application prospects in the medical and healthcare field. However, existing video-based methods are unsuitable for complex realistic scenes, mainly because they ignore the shaking interference of the target in the video and the spatial scale features, so the extracted blood volume pulse (BVP) signal is inaccurate and the detection precision is unsatisfactory. To overcome these defects, a non-contact heart rate detection method resistant to facial shaking interference is proposed. Method The proposed method consists of three steps. First, to address the problem that target shaking disturbs the selection of the face region, discriminative response map fitting is used to detect the face region and the feature points of the main facial organs in a reference image, and the idea of tilt correction is introduced into face tracking for the first time, outputting a face video with shaking interference suppressed. Then, taking differences in spatial scale into account, a color magnification method performs spatio-temporal processing on the shake-suppressed face video to extract a clean BVP signal. Finally, considering the small-sample problem, the heart rate is estimated by a frequency-domain analysis method based on iterative interpolation of Fourier coefficients. Result Videos were collected in a cooperative situation with a static face and a non-cooperative situation with a shaking face, and the heart rate detection results were analyzed quantitatively. The accuracy of the proposed method in the two situations is 97.84% and 97.30%, respectively; compared with classical and state-of-the-art methods, the accuracy improves by more than 1% in the cooperative situation and by more than 7% in the non-cooperative situation, demonstrating excellent performance. Conclusion A heart rate detection method based on face video processing is proposed; by effectively analyzing the facial shaking interference and scale characteristics, a clean BVP signal is extracted, and the precision and robustness of heart rate detection are improved.
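The final step above recovers the heart rate from a short BVP recording by iteratively interpolating Fourier coefficients, which refines the coarse FFT frequency grid that a small sample imposes. The abstract does not spell out the estimator, so the following Python sketch assumes a standard iterative interpolation scheme of the Aboutanios-Mulgrew type; the function name estimate_heart_rate, the 0.7-4.0 Hz search band, and the three refinement iterations are illustrative assumptions rather than details from the paper.

import numpy as np

def estimate_heart_rate(bvp, fs, n_iter=3):
    # bvp: 1-D blood volume pulse samples; fs: sampling rate in Hz.
    x = np.asarray(bvp, dtype=float)
    x = x - x.mean()                     # drop the DC component
    n = len(x)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Coarse search: highest spectral peak inside a plausible
    # heart-rate band (0.7-4.0 Hz, i.e. 42-240 beats/min; an
    # assumed band, not a value from the paper).
    band = np.flatnonzero((freqs >= 0.7) & (freqs <= 4.0))
    m = band[np.argmax(np.abs(spec[band]))]

    # Fine search: iteratively refine the fractional bin offset from
    # Fourier coefficients interpolated at m + delta +/- 0.5, which
    # counters the coarse FFT grid of a short (small-sample) signal.
    idx = np.arange(n)
    delta = 0.0
    for _ in range(n_iter):
        xp = np.dot(x, np.exp(-2j * np.pi * (m + delta + 0.5) * idx / n))
        xm = np.dot(x, np.exp(-2j * np.pi * (m + delta - 0.5) * idx / n))
        delta += 0.5 * np.real((xp + xm) / (xp - xm))

    return 60.0 * (m + delta) * fs / n   # heart rate in beats per minute

As a sense of scale, a 10 s window sampled at 30 frame/s gives n = 300 and an FFT grid of fs/n = 0.1 Hz (6 beats/min), so without the fractional-bin refinement the estimate could be off by several beats per minute.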
Objective Heart rate is one of the important indicators that directly reflect the health of the human body. Heart rate detection has been applied to many aspects of the medical field, such as physical examination, major surgery, and postoperative treatment. Heart rate detection based on face video processing has recently been performed in a non-contact manner, without complex operations or a sense of restraint. However, the existing methods cannot predict well in complex realistic scenes, such as those with a shaking target. If face detection in video processing is accompanied by facial shaking, the facial region of interest is selected inaccurately. Such methods also disregard spatial scale features, which are significant for extracting the blood volume pulse (BVP) signal. The results of current methods are consequently inadequate. To this end, a new non-contact heart rate detection method based on face video processing is proposed to reduce the influence of facial shake and improve precision. Method Our method consists of three major steps. First, we process the video with a robust face detection and tracking model to obtain a refined face video in which facial shake is eliminated. Considering that the universal Viola-Jones face detection model generates an incorrect face area when a face is tilted across consecutive frames, discriminative response map fitting (DRMF) is used to detect important feature points for tracking the correct face area. For the first frame image, we mark 66 landmark points on the facial organs (eyes, nose, mouth, and facial contour) and the four vertexes of the facial rectangle. These feature points are then fed into the Kanade-Lucas-Tomasi (KLT) tracking model to calculate the facial rectangle of subsequent frames. According to the oblique angle of each facial rectangle, the corresponding face image is rotated to a vertical position. Second, the modified face video is handled by a space-time processing algorithm that amplifies the video color variations to separate the spatial scale characteristics of the face video and extract a clean BVP signal.
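To make the first step concrete, the following Python/OpenCV sketch illustrates the tracking-plus-tilt-correction idea under stated assumptions: DRMF itself is not reproduced, so the 66 landmark points of the first frame are assumed to be already available as landmarks0; the tilt angle is recovered from a similarity transform fitted with cv2.estimateAffinePartial2D, standing in for the paper's facial-rectangle angle computation; and the function name suppress_face_shake is illustrative.

import cv2
import numpy as np

def suppress_face_shake(frames, landmarks0):
    # frames: iterable of BGR frames; the first frame is the reference.
    # landmarks0: (K, 2) float32 array of landmark points detected by
    # DRMF in the reference frame (assumed computed beforehand).
    frames = iter(frames)
    ref = next(frames)
    prev_gray = cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY)
    pts = landmarks0.reshape(-1, 1, 2).astype(np.float32)
    yield ref  # the reference frame defines the upright pose

    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # KLT optical flow moves the previous landmarks to this frame.
        pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        ok = status.ravel() == 1
        # A similarity transform from the reference landmarks to the
        # tracked ones yields the in-plane tilt accumulated so far.
        M, _ = cv2.estimateAffinePartial2D(landmarks0[ok],
                                           pts.reshape(-1, 2)[ok])
        tilt = np.degrees(np.arctan2(M[1, 0], M[0, 0]))
        # Rotate the frame back so the face returns to vertical.
        h, w = frame.shape[:2]
        R = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), tilt, 1.0)
        yield cv2.warpAffine(frame, R, (w, h))
        prev_gray = gray

A full pipeline would also re-run DRMF periodically to re-seed landmarks lost by the tracker; the sketch keeps the point set fixed for brevity.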