It is often necessary to recognize human mouth states in order to detect the number of audio sources and to improve the speech recognition capability of an intelligent robot auditory system. A human mouth-state recognition method based on image warping and sparse representation (SR) combined with the homotopy method is proposed. Using properly warped training mouth-state images as atoms of the overcomplete dictionary reduces the impact of variation in the mouths' scales, shapes and positions, which further improves robustness and relaxes the requirement for a large number of training samples. The homotopy method is employed to compute the expansion coefficients effectively, i.e., for sparse coding. Orthogonal matching pursuit (OMP) is also tested and compared with the homotopy method. Experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed approach.
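The classification step summarized above can be illustrated with a minimal sketch of sparse-representation classification. It is not the authors' implementation. It uses a plain NumPy version of OMP (the baseline the abstract mentions) rather than the homotopy solver, it omits image warping entirely, and it replaces real mouth images with synthetic feature vectors. The names `omp` and `src_classify`, the labels `"open"` and `"closed"`, and all sizes and noise levels are assumptions made for illustration.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy OMP: approximate y using at most n_nonzero columns (atoms) of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x


def src_classify(D, labels, y, n_nonzero=5):
    """Assign y to the class whose atoms alone reconstruct it with least error."""
    x = omp(D, y, n_nonzero)
    best_label, best_residual = None, np.inf
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only class-c coefficients
        r = np.linalg.norm(y - D @ x_c)
        if r < best_residual:
            best_label, best_residual = str(c), r
    return best_label


# Synthetic stand-in for (already warped) training images: each class is a
# random prototype vector plus noise. Purely illustrative data.
rng = np.random.default_rng(0)
dim, per_class = 64, 10
prototypes = {"open": rng.normal(size=dim), "closed": rng.normal(size=dim)}

atoms, labels = [], []
for name, proto in prototypes.items():
    for _ in range(per_class):
        atoms.append(proto + 0.3 * rng.normal(size=dim))
        labels.append(name)

D = np.array(atoms).T
D /= np.linalg.norm(D, axis=0)  # unit-norm dictionary atoms
labels = np.array(labels)

# A new noisy sample drawn near the "open" prototype.
y = prototypes["open"] + 0.3 * rng.normal(size=dim)
y /= np.linalg.norm(y)

sparse_code = omp(D, y, 5)
predicted = src_classify(D, labels, y)
print(predicted)
```

In this sketch each class is scored by the reconstruction residual obtained from its own coefficients only, which is the usual decision rule in sparse-representation classification. Whether the paper uses exactly this rule is not stated in the abstract.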