Emotional feature extraction is an important part of speech emotion recognition. Because of the limitations of traditional signal processing methods, traditional acoustic features, especially frequency-domain features, are extracted inaccurately and do not represent the emotional characteristics of speech well, which leads to a low emotion recognition rate. This paper processes emotional speech with the Hilbert-Huang Transform (HHT) to obtain its Hilbert marginal energy spectrum. By comparing the marginal energy spectra of different emotional speech on the Mel scale, a new set of emotional features is proposed: the Mel-Frequency Marginal Energy Coefficient (MFEC), the Mel-frequency Sub-band Spectral Centroid (MSSC), and the Mel-frequency Sub-band Spectral Flatness (MSSF). A Support Vector Machine (SVM) is then used to recognize five emotions: sadness, happiness, boredom, anger, and neutral. The experimental results show that the new emotional features extracted by this method achieve good recognition performance.
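As a rough illustration of the Mel-scale sub-band stage described above, the following sketch computes a per-band log energy, spectral centroid, and spectral flatness from a one-dimensional energy spectrum. It rests on several assumptions that do not come from the abstract: the Hilbert marginal energy spectrum is taken as already computed and is replaced here by a synthetic array, the sub-bands are non-overlapping rectangular bands spaced evenly in Mel, the common 2595·log10(1 + f/700) Mel formula is used, and flatness is the usual ratio of geometric to arithmetic mean. The function names and the default of 8 bands are hypothetical, and the paper's exact definitions of MFEC, MSSC, and MSSF may differ.

```python
import numpy as np


def hz_to_mel(f_hz):
    # Common Mel-scale formula (an assumption; the abstract does not give one).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    # Band edges in Hz, equally spaced on the Mel scale.
    mel_edges = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_bands + 1)
    edges = mel_to_hz(mel_edges)
    # Pin the outer edges so round-off cannot drop the first or last bin.
    edges[0], edges[-1] = f_min, f_max
    return edges


def mel_subband_features(freqs, energy, n_bands=8, eps=1e-12):
    """Per-band log energy, centroid (Hz), and flatness of an energy spectrum.

    `freqs` must be sorted ascending; `energy` is a non-negative spectrum
    sampled at those frequencies (standing in for a marginal energy spectrum).
    Bands that contain no frequency bin are returned as NaN.
    """
    freqs = np.asarray(freqs, dtype=float)
    energy = np.asarray(energy, dtype=float)
    edges = mel_band_edges(freqs[0], freqs[-1], n_bands)

    log_energy = np.full(n_bands, np.nan)
    centroid = np.full(n_bands, np.nan)
    flatness = np.full(n_bands, np.nan)

    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        # The last band is closed on the right so it includes f_max.
        upper = freqs <= hi if b == n_bands - 1 else freqs < hi
        mask = (freqs >= lo) & upper
        if not mask.any():
            continue
        e = energy[mask] + eps  # eps avoids log(0) and division by zero
        f = freqs[mask]
        log_energy[b] = np.log(e.sum())
        centroid[b] = (f * e).sum() / e.sum()
        # Geometric mean over arithmetic mean, in (0, 1].
        flatness[b] = np.exp(np.log(e).mean()) / e.mean()

    return log_energy, centroid, flatness


if __name__ == "__main__":
    # Synthetic stand-in for a marginal energy spectrum: one peak near 1 kHz.
    freqs = np.linspace(0.0, 4000.0, 4001)
    energy = np.exp(-0.5 * ((freqs - 1000.0) / 50.0) ** 2)
    log_e, cen, flat = mel_subband_features(freqs, energy)
    print(np.round(cen, 1))
    print(np.round(flat, 3))
```

In a full pipeline, vectors like these would be computed per utterance or per frame and concatenated before being passed to a classifier such as an SVM. That stage, and the HHT decomposition itself, are outside this sketch.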