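To improve the accuracy of voice activity detection systems at low signal-to-noise ratios, an endpoint detection algorithm based on cepstral features and spectral entropy is proposed. First, the cepstral feature of the speech frame under test is obtained through analysis. The likelihood of this feature is then computed under a speech Gaussian mixture model and a noise Gaussian mixture model, both obtained by training, and an initial speech/non-speech decision is made by comparing the two likelihoods. This initial decision is combined with the result of energy-entropy endpoint detection to give the final decision, and a Hangover mechanism is finally applied to protect the speech as far as possible. Experimental results show that the method remedies the weakness of energy-entropy endpoint detection under babble noise and outperforms G.729 Annex B in a range of noise environments.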
To improve the accuracy of Voice Activity Detection (VAD) in noisy environments with low SNR, an algorithm based on Linear Predictive Cepstral Coefficients (LPCC) and energy entropy is proposed. First, the LPCC extracted from the input speech are evaluated under a speech model and a noise model, each of which is a Gaussian Mixture Model (GMM), to calculate the likelihood ratio of speech to noise. The first-stage VAD decision is made on the basis of this likelihood ratio. The spectrum entropy is then applied in the second decision-making stage. Finally, a mechanism called Hangover is used to better protect the speech. Experimental results show that the new algorithm compensates for the drawbacks of the spectrum entropy method in babble noise environments. Furthermore, it outperforms G.729 Annex B under various noise environments.
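As a rough illustration of the first-stage decision and the Hangover step described above, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance GMMs given as `(weights, means, variances)` tuples, a log-likelihood-ratio threshold of 0, and a fixed hangover length in frames; all function names and parameter values are hypothetical. The entropy-based second stage is omitted because the abstract does not state how the two decisions are combined.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM.

    The diagonal-covariance form is an assumption made for this sketch.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, m, v in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def first_stage_decision(x, speech_gmm, noise_gmm, threshold=0.0):
    """Return True (speech) if the log-likelihood ratio exceeds the threshold.

    The threshold value is a hypothetical default, not taken from the paper.
    """
    llr = gmm_log_likelihood(x, *speech_gmm) - gmm_log_likelihood(x, *noise_gmm)
    return llr > threshold


def apply_hangover(decisions, hang_frames=2):
    """Keep the speech label for hang_frames frames after speech ends.

    This is one common form of hangover; the paper's exact rule may differ.
    """
    smoothed = []
    remaining = 0
    for is_speech in decisions:
        if is_speech:
            remaining = hang_frames
            smoothed.append(True)
        elif remaining > 0:
            remaining -= 1
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed
```

In use, each frame's LPCC vector would be passed to `first_stage_decision`, and the resulting per-frame sequence (after any second-stage correction) would be passed to `apply_hangover` so that weak speech tails following a detected segment are not clipped.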