In multimedia conference rooms, non-Gaussian interference such as applause and coughing often severely degrades the performance of speech processing systems. To suppress such interference, this paper proposes a voice activity detection (VAD) method based on higher-order statistics in the linear prediction residual domain. The method uses the normalized kurtosis (NK) of the linear prediction residual as its decision criterion, because the NK characterizes the number of harmonics in a signal, and speech differs markedly from non-Gaussian noise in this respect. The method also estimates the energy of the Gaussian background noise in advance, which reduces the influence of that background noise on VAD performance. Simulation results show that the proposed method can effectively distinguish speech from non-Gaussian noise in the presence of Gaussian background noise.
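As a rough illustration of the statistic named above, the following sketch computes the normalized kurtosis of a frame's linear prediction residual. It is not the paper's implementation. The function names (`lp_residual`, `normalized_kurtosis`, `residual_nk`), the prediction order of 10, the autocorrelation method for the predictor, and the definition of NK as excess kurtosis (fourth central moment over squared variance, minus 3) are all assumptions made here. The abstract does not specify the decision threshold or the background-noise energy compensation, so both are left out.

```python
import numpy as np

def lp_residual(frame, order=10):
    """Linear prediction residual via the autocorrelation method (assumed here)."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Autocorrelation for lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R[np.diag_indices(order)] += 1e-9 * r[0] + 1e-12
    a = np.linalg.solve(R, r[1:])
    # Residual e[n] = x[n] - sum_k a[k] * x[n - k].
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * x[: n - k]
    return x - pred

def normalized_kurtosis(e):
    """Excess kurtosis: close to 0 for Gaussian data (assumed definition of NK)."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    var = np.mean(e ** 2)
    if var == 0.0:
        return 0.0
    return float(np.mean(e ** 4) / var ** 2 - 3.0)

def residual_nk(frame, order=10):
    """NK of the linear prediction residual of one frame."""
    return normalized_kurtosis(lp_residual(frame, order))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Gaussian background noise: residual NK should stay near zero.
    noise = rng.standard_normal(4000)
    # Crude voiced-speech stand-in: periodic pulses through a two-pole resonator.
    pulses = np.zeros(4000)
    pulses[::80] = 1.0
    voiced = np.zeros(4000)
    for i in range(4000):
        prev1 = voiced[i - 1] if i >= 1 else 0.0
        prev2 = voiced[i - 2] if i >= 2 else 0.0
        voiced[i] = pulses[i] + 1.3 * prev1 - 0.8 * prev2
    print("NK, Gaussian noise:", residual_nk(noise))
    print("NK, voiced-like signal:", residual_nk(voiced))
```

The synthetic signals only check that the statistic behaves as expected: a Gaussian frame yields a residual NK near zero, while a pulse-excited signal yields a strongly peaked residual. They do not reproduce the paper's speech versus applause or cough experiments.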