Whispered Mandarin is a special mode of phonation: because the vocal folds do not vibrate quasi-periodically, there is no fundamental frequency (pitch) to carry tone, so whispered tone cannot be characterized by the pitch period. Mandarin is a tonal language in which much of a speaker's meaning is conveyed by tone, so tone feature extraction is an important step in whispered speech processing. The conventional features used in speech recognition and speaker recognition mainly encode phonetic content or speaker identity and carry little tone information, so they perform poorly in whispered tone recognition experiments. This paper proposes a new feature parameter that emulates the fundamental frequency tone contour of normal speech. Tone information is a weak signal and may not be visible across the full frequency range. Starting from the characteristics of human hearing, we study the Bark bands that are sensitive to tone and find that the fitted curve of the normalized energy proportion of a subset of the diffused (spread) Bark spectrum follows a trajectory similar to the fundamental frequency contour of normal speech. This suggests that the curve carries, at least in part, the tone information of whispered speech and can to some extent replace the pitch contour as its carrier. In whispered tone recognition experiments that use this curve and the short-time energy curve as features and a neural network as the model, the overall recognition accuracy for the four Mandarin tones reaches 78%. This result provides a basis for further study in whispered speech processing.
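To make the proposed feature more concrete, the following is a minimal sketch of a Bark-band energy proportion contour. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify which Bark bands are tone-sensitive, how the spectrum is spread, or what kind of curve is fitted. The sketch therefore takes the band set as a hypothetical parameter `bands`, omits the spreading step entirely, and uses a low-order polynomial fit. The Hz-to-Bark mapping is the Zwicker-Terhardt approximation, and all function names are invented for this example.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def bark_energy_proportion(signal, sr, bands, frame_len=512, hop=256):
    """Per-frame share of spectral energy that falls in the chosen Bark bands.

    `bands` is a hypothetical choice of integer Bark band indices; the
    abstract does not say which bands are used. No spreading function is
    applied here (a simplification). Assumes len(signal) >= frame_len.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)

    # Assign every FFT bin to an integer Bark band.
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    band_idx = np.floor(hz_to_bark(freqs)).astype(int)
    selected = np.isin(band_idx, list(bands))

    props = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        total = power.sum()
        # Normalize by the total frame energy so the value lies in [0, 1].
        props[i] = power[selected].sum() / total if total > 0 else 0.0
    return props


def fit_contour(props, degree=3):
    """Smooth the proportion sequence with a polynomial fit (assumed form)."""
    t = np.linspace(0.0, 1.0, len(props))
    coeffs = np.polyfit(t, props, degree)
    return np.polyval(coeffs, t)
```

In this sketch, `fit_contour(bark_energy_proportion(x, sr, bands=range(4, 9)))` would give one smoothed contour per syllable, which could be combined with a short-time energy curve as classifier input. The band range shown is arbitrary and only serves to exercise the code.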