A Modified Voice Conversion Algorithm Using Compressed Sensing

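ISSN: 0217-9776
Journal: 声学学报(英文版) (Acta Acustica, English edition)
Year: 2014
Pages: 323-333
Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology] TN912.3 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliations: [1] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018; [2] Inst. of Electronic and Information Engineering, Shanghai Univ. of Electric Power, Shanghai 200090
Funding: Supported by the National Natural Science Foundation of China (61201301); Program of Zhejiang Provincial Education Department (Y201016542)
Related project: Research on training algorithms for voice conversion functions with asymmetric corpora (用于非对称语料的语音转换函数训练算法研究)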

Author: 简志华 (Jian Zhihua)

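Keywords: conversion algorithm, speech frame, compression, discrete cosine transform domain, sensing, feature vector, conversion function, conversion system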

Abstract:

A voice conversion algorithm that uses compressed sensing to exploit the information between consecutive speech frames is proposed in this paper. Based on the sparsity of the concatenated vector of several consecutive Linear Spectrum Pairs (LSPs) in the discrete cosine transform domain, the paper uses compressed sensing to extract a compressed vector from the concatenated LSPs and uses it as the feature vector to train the conversion function. Evaluation results show that, when an appropriate number of speech frames is chosen, this approach improves performance by 3.21% on average over the conventional algorithm based on weighted frequency warping. The experimental results also show that the performance of a voice conversion system can be improved by taking full advantage of inter-frame information, because this information helps the converted speech retain the more stable acoustic properties that are inherent across frames.
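
The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the record does not give the LSP order, the number of concatenated frames, the measurement matrix, or the compressed dimension, so the values below (order 10, 4 frames, a random Gaussian matrix, 16 measurements) are assumptions, and `make_toy_lsp_frames` is a hypothetical stand-in for real LSP analysis of speech. The sketch only stacks consecutive frames, inspects how energy is distributed in the DCT domain, and computes a compressed vector; training the conversion function is not shown.

```python
import math
import random


def dct_ii(x):
    """Orthonormal DCT-II of a sequence (pure Python, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def concatenate_frames(frames, start, num_frames):
    """Stack num_frames consecutive LSP vectors into one long vector."""
    stacked = []
    for frame in frames[start:start + num_frames]:
        stacked.extend(frame)
    return stacked


def energy_fraction(coeffs, k):
    """Share of total energy held by the k largest-magnitude coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    return sum(energies[:k]) / sum(energies)


def gaussian_measurement_matrix(m, n, seed=0):
    """Assumed measurement matrix: i.i.d. Gaussian entries, variance 1/m."""
    rng = random.Random(seed)
    std = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, std) for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Compressed vector y = phi @ x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def make_toy_lsp_frames(num_frames, order, seed=1):
    """Hypothetical toy data: slowly drifting, increasing values in (0, pi)."""
    rng = random.Random(seed)
    base = [math.pi * (i + 1) / (order + 1) for i in range(order)]
    return [
        [b + 0.02 * math.sin(0.3 * t + i) + rng.gauss(0.0, 0.002)
         for i, b in enumerate(base)]
        for t in range(num_frames)
    ]


frames = make_toy_lsp_frames(num_frames=20, order=10)
x = concatenate_frames(frames, start=0, num_frames=4)  # 4 frames x order 10 = 40 values
coeffs = dct_ii(x)                                     # DCT-domain representation
phi = gaussian_measurement_matrix(m=16, n=len(x))      # 16 x 40, assumed sizes
y = compress(phi, x)                                   # 16-dimensional feature vector

print("energy in 8 largest DCT coefficients:", round(energy_fraction(coeffs, 8), 3))
print("compressed feature length:", len(y))
```

In this sketch the compressed vector `y` plays the role of the feature vector that the abstract says is used to train the conversion function; a full system would also need a sparse recovery step to map converted features back to LSPs, which is outside the scope of this example.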
