Whispered speaker identification based on feature and model hybrid compensation

Journal: Chinese Journal of Acoustics
Year: 2012
Pages: 499-508
Classification: TN713.7 [Electronics and Telecommunications: Circuits and Systems]; TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] School of Physical Science and Technology & School of Energy, Soochow University, Suzhou 215006; [2] School of Electronics and Information Engineering, Soochow University, Suzhou 215006
Funding: This work was supported by the National Natural Science Foundation of China (61271359, 61071215), the Suzhou Science and Technology Development Plan (SYG201001), and the Key Joint Laboratory of Soochow University and JieMei Biomedical Engineering Instrument.
Related project: Research on speaker recognition under whispered phonation based on JFA

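Keywords: bilinear transformation, voice conversion, whispered speech, spectral distortion, conversion function, expansion factor, formant, non-linear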

Abstract:

A method for converting whispered speech to normal speech using an extended bilinear transformation is proposed. Because the formants of whispered speech deviate by different amounts in different frequency bands, the whispered spectrum is partitioned and each partition is processed separately. On the basis of this partitioned spectrum, a conversion function from whispered speech to normal speech is established. Because the offset of whispered speech relative to normal speech is non-linear, an expansion factor is introduced into the bilinear transform function so that it corresponds more closely to the actual requirements of whisper-to-normal conversion. This factor accounts for both the non-linear shift of the spectrum and the compression of the formant bandwidth, and thus reduces the spectral distortion distance of the conversion. Experimental results show that the proposed conversion improves both the sound quality and the intelligibility of whispered speech.
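
For orientation, the frequency warping that a bilinear (first-order all-pass) transform produces can be sketched as follows. This is only an illustration of the standard warping curve, applied band by band. It does not reproduce the extended transform or the expansion factor of the paper, whose formula is not given in this record. The function names, the band edges, and the per-band `alpha` values are all hypothetical.

```python
import math


def bilinear_warp(omega, alpha):
    """Warp a normalized angular frequency omega (0..pi) with the standard
    first-order all-pass (bilinear) transform, |alpha| < 1.

    alpha > 0 shifts frequencies upward, alpha < 0 shifts them downward;
    the end points 0 and pi stay fixed.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def bandwise_warp(omega, edges, alphas):
    """Apply a separate warp inside each frequency band (an assumed scheme).

    edges  : ascending band edges, e.g. [0, pi/2, pi]
    alphas : one warping coefficient per band (hypothetical values)

    Each band is rescaled to 0..pi, warped, and mapped back, so the band
    edges stay fixed and the overall curve remains continuous.
    """
    for lo, hi, alpha in zip(edges[:-1], edges[1:], alphas):
        if lo <= omega <= hi:
            local = (omega - lo) / (hi - lo) * math.pi
            return lo + bilinear_warp(local, alpha) / math.pi * (hi - lo)
    raise ValueError("omega lies outside the given band edges")


if __name__ == "__main__":
    edges = [0.0, math.pi / 2, math.pi]   # two illustrative bands
    alphas = [-0.2, 0.1]                  # illustrative coefficients only
    for k in range(5):
        w = k * math.pi / 4
        print(round(w, 3), round(bandwise_warp(w, edges, alphas), 3))
```

Warping each band within its own edges is one simple way to give different bands different amounts of shift without introducing jumps at the boundaries; the paper may partition and join the bands differently.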

