This paper proposes an algorithm that extracts formants from speech signals by combining Empirical Mode Decomposition (EMD) with weighted Mel-cepstrum (WMCEP) analysis. The speech signal is first decomposed with EMD, the Intrinsic Mode Functions (IMFs) that contain formants are identified, and these IMFs are summed to form a reconstructed speech signal. Weighted Mel-cepstrum analysis of the reconstructed signal then yields the Weighted Mel-Cepstrum Coefficients (WMCCs), which capture the main components of the spectrum. A Discrete Cosine Transform (DCT) based smoothing algorithm is applied to the WMCCs to obtain the spectral envelope, whose peak positions give the candidate formants. The formant estimates are selected from these candidates according to the continuity constraint and the frequency range of formants. Experimental results show that the algorithm yields smaller formant errors than using WMCEP alone, and that it still extracts formants accurately when the signal-to-noise ratio is below 20 dB.
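The last two stages described above (DCT-based smoothing followed by candidate selection) can be illustrated with a minimal sketch. It is not the authors' implementation: the EMD step and the weighted Mel-cepstrum analysis are not reproduced, a synthetic log-magnitude spectrum on a linear frequency axis stands in for the WMCC-derived spectrum, and every name and parameter here (`smooth_envelope`, `candidate_peaks`, `select_formants`, `n_keep`, `max_drop_db`, `max_jump_hz`, the formant ranges, the previous-frame values) is a hypothetical choice made for illustration. Only the standard library is used.

```python
import math

def dct2(x):
    """Unnormalised DCT-II: X[k] = sum_i x[i] * cos(pi * k * (i + 0.5) / n)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def smooth_envelope(log_spectrum, n_keep):
    """Smooth the log spectrum by inverting only the first n_keep DCT terms."""
    n = len(log_spectrum)
    c = dct2(log_spectrum)
    m = min(n_keep, n)
    return [(c[0] + 2.0 * sum(c[k] * math.cos(math.pi * k * (i + 0.5) / n)
                              for k in range(1, m))) / n
            for i in range(n)]

def candidate_peaks(envelope, bin_hz, max_drop_db=12.0):
    """Frequencies (Hz) of local maxima within max_drop_db of the largest value."""
    floor = max(envelope) - max_drop_db
    return [(i + 0.5) * bin_hz
            for i in range(1, len(envelope) - 1)
            if envelope[i - 1] < envelope[i] >= envelope[i + 1]
            and envelope[i] >= floor]

def select_formants(candidates, previous, ranges, max_jump_hz=300.0):
    """Pick one candidate per formant: inside its frequency range and within
    max_jump_hz of the previous frame's value (assumed continuity rule)."""
    chosen = []
    for prev, (lo, hi) in zip(previous, ranges):
        ok = [f for f in candidates
              if lo <= f <= hi and f not in chosen
              and abs(f - prev) <= max_jump_hz]
        chosen.append(min(ok, key=lambda f: abs(f - prev)) if ok else None)
    return chosen

# Synthetic frame (assumed values): two resonances plus a 100 Hz harmonic ripple.
N_BINS, FS = 128, 8000.0
BIN_HZ = (FS / 2) / N_BINS
freqs = [(i + 0.5) * BIN_HZ for i in range(N_BINS)]

def bump(f, centre_hz, height_db, width_hz=150.0):
    """Gaussian-shaped resonance in the log spectrum."""
    return height_db * math.exp(-0.5 * ((f - centre_hz) / width_hz) ** 2)

log_spec = [bump(f, 700.0, 20.0) + bump(f, 1800.0, 15.0)
            + 3.0 * math.cos(2 * math.pi * f / 100.0) for f in freqs]

envelope = smooth_envelope(log_spec, n_keep=30)
candidates = candidate_peaks(envelope, BIN_HZ)
formants = select_formants(candidates,
                           previous=[680.0, 1750.0],           # hypothetical last frame
                           ranges=[(200.0, 1000.0), (600.0, 2800.0)])
print(candidates, formants)
```

Truncating the DCT removes the fast harmonic ripple while keeping the slowly varying envelope, so the surviving peaks sit at the resonances. The range and continuity checks then reject candidates that are implausible for a given formant or that jump too far from the previous frame; a formant with no acceptable candidate is returned as `None`.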