A high-quality 300 bit/s vocoder algorithm is proposed, based on the enhanced mixed excitation linear prediction (MELP) model. Only a few parameters are extracted from each speech frame. To improve quantization efficiency, every eight speech frames are grouped into a superframe, and the superframe parameters are vector quantized. The band-pass voicing coefficients (BPVC) are estimated by codebook mapping based on mode transitions, which improves their quantization accuracy. The codebook sizes used to quantize the pitch parameters under the different band-pass voicing modes are jointly optimized to improve quantization efficiency. In addition, multi-stage vector quantization with inter-stage prediction is applied to the line spectral frequency (LSF) parameters to reduce spectral distortion. Subjective listening tests show that the vocoder achieves high intelligibility and a reasonable degree of naturalness, with a diagnostic rhyme test (DRT) score of 84.2%.
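The multi-stage vector quantization mentioned for the LSF parameters can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the inter-stage predictor, so it is omitted here and each stage simply quantizes the residual left by the previous one. The function names (`nearest`, `msvq_encode`, `msvq_decode`) and the toy two-dimensional codebooks in `STAGES` are hypothetical and chosen only for illustration.

```python
# Plain multi-stage (residual) vector quantization, pure Python.
# ASSUMPTION: no inter-stage prediction; the abstract does not specify it.
# ASSUMPTION: STAGES holds toy 2-D codebooks, not trained LSF codebooks.

STAGES = [
    [[0.2, 0.5], [0.4, 0.8]],                               # stage 1: 2 codewords (1 bit)
    [[-0.05, 0.0], [0.05, 0.0], [0.0, 0.05], [0.0, -0.05]],  # stage 2: 4 codewords (2 bits)
]


def nearest(codebook, target):
    """Return the index of the codeword closest to target (squared Euclidean distance)."""
    def dist(codeword):
        return sum((c - t) ** 2 for c, t in zip(codeword, target))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def msvq_encode(stages, vector):
    """Quantize vector stage by stage; each stage codes the previous stage's residual."""
    residual = list(vector)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        # Subtract the chosen codeword so the next stage sees only the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def msvq_decode(stages, indices):
    """Reconstruct the vector as the sum of the selected codewords from every stage."""
    out = [0.0] * len(stages[0][0])
    for codebook, i in zip(stages, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


if __name__ == "__main__":
    idx = msvq_encode(STAGES, [0.43, 0.79])
    print(idx, msvq_decode(STAGES, idx))
```

With these toy codebooks, three bits (one for stage 1, two for stage 2) address eight reconstruction points while only six codewords are stored. This saving in storage and search is the usual motivation for splitting a large codebook into stages.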