Research on Speech Synthesis Techniques Based on Acoustic Statistical Modeling (基于声学统计建模的语音合成技术研究)

ISSN: 1003-0077
Journal: Journal of Chinese Information Processing (《中文信息学报》)
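Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] iFLYTEK Speech Laboratory, University of Science and Technology of China, Hefei, Anhui 230027
Funding: Research on High-Performance Chinese Text-to-Speech Conversion (69975018); Research on Highly Expressive Multilingual Prosody Modeling (60475015); Research on Statistical-Modeling Speech Synthesis Methods Incorporating Articulatory Parameters (60905010)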

Keywords: speech synthesis, hidden Markov model, parametric synthesis, unit selection

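Abstract (translated from the Chinese):

This paper introduces speech synthesis based on acoustic statistical modeling, focusing on the innovative work of the iFLYTEK Speech Laboratory at the University of Science and Technology of China in this frontier direction of speech synthesis. The work covers three contributions: integrating articulatory parameters with acoustic parameters to make acoustic parameter generation more flexible; replacing the maximum likelihood criterion with a minimum generation error criterion to improve the quality of synthesized speech; and using unit selection and waveform concatenation in place of reconstruction by a parametric synthesizer, to remedy the shortcomings of parametric synthesizers in synthesized speech quality. These innovations further improve speech synthesis systems in naturalness, expressiveness, flexibility, and multilingual applicability, and have promoted the wide application of speech synthesis in call-center information services, human-machine speech interaction on mobile embedded devices, intelligent speech-based teaching, and other fields.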

English abstract:

This paper introduces speech synthesis technologies based on acoustic statistical modeling. The emphasis is on the research progress contributed by the USTC iFLYTEK Speech Laboratory, which includes: integrating articulatory features with acoustic features to improve the flexibility of acoustic parameter generation; proposing a minimum generation error criterion to replace maximum likelihood and improve synthesized speech quality; and using unit selection and waveform concatenation in place of a parametric synthesizer to avoid the speech-quality limitations of HMM-based parametric synthesis. These techniques improve the performance of speech synthesis systems in naturalness, expressiveness, flexibility, and multilingual capability. This progress has led to speech synthesis being widely used in call-center information services, human-machine speech interaction on mobile embedded devices, and intelligent speech-enabled electronic education systems.
