The traditional weighted frequency warping (WFW) algorithm converts speech feature parameters one frame at a time and does not take into account the contextual information across neighboring frames. To address this, this paper proposes a modified weighted frequency warping (MWFW) algorithm that uses compressed sensing (CS) theory to extract inter-frame correlation information from the speech signal. Instead of transforming features frame by frame, MWFW in effect converts the speech segment by segment. Objective tests and subjective listening evaluations show that, although the performance of MWFW depends on the length of the speech segment, it outperforms traditional WFW when an appropriate segment length is chosen.
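The difference between frame-level and segment-level processing can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the measurement matrix, the sparsifying basis, or how the CS output enters the warping step. The function names (`split_into_segments`, `measurement_matrix`, `compress_segment`), the random Gaussian matrix, and the non-overlapping segmentation are all assumptions made for illustration. The sketch only shows how consecutive frames might be grouped into segments and how each feature trajectory within a segment could be reduced to a few CS-style linear measurements y = Φx.

```python
import random


def split_into_segments(frames, seg_len):
    """Group consecutive frames into non-overlapping segments of seg_len frames.

    Trailing frames that do not fill a whole segment are dropped
    (an illustrative simplification, not taken from the paper).
    """
    last_start = len(frames) - seg_len + 1
    return [frames[i:i + seg_len] for i in range(0, last_start, seg_len)]


def measurement_matrix(m, n, seed=0):
    """Hypothetical m x n random Gaussian measurement matrix Phi (assumed choice)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress_segment(segment, phi):
    """Apply y = Phi x to the trajectory of each feature dimension over the segment.

    segment: list of seg_len frames, each a list of feature values.
    Returns one list of m measurements per feature dimension.
    """
    dim = len(segment[0])
    measurements = []
    for d in range(dim):
        # Trajectory of feature d across the frames of this segment.
        trajectory = [frame[d] for frame in segment]
        measurements.append(
            [sum(p * x for p, x in zip(row, trajectory)) for row in phi]
        )
    return measurements


# Toy input: 10 frames with 2 features each (made-up values).
frames = [[float(t), float(2 * t)] for t in range(10)]
segments = split_into_segments(frames, seg_len=4)
phi = measurement_matrix(m=2, n=4)
compressed = [compress_segment(seg, phi) for seg in segments]
```

A frame-by-frame method would transform each entry of `frames` independently. Here every measurement mixes all frames of a segment, so the segment length `seg_len` directly controls how much inter-frame context each converted unit carries, which is consistent with the reported dependence of performance on segment length.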