In VoIP speaker recognition, recognition performance degrades when speaker models trained on original speech (speech that has not passed through a codec) are used to recognize test speech that has been encoded and decoded. This paper presents a compensation algorithm for VoIP speaker features (12th-order LPCC coefficients) based on stochastic matching and the EM (expectation-maximization) algorithm. Assuming that the distorted features and the undistorted features are related by either a linear function or a nonlinear (second-order, i.e. quadratic) function, the algorithm estimates the parameters of that function and uses the resulting compensation function to compensate the distorted features. Experimental results show that the method substantially reduces the loss in recognition performance caused by the G.729 (8 kb/s), G.723.1 (6.3 kb/s), and G.723.1 (5.3 kb/s) codecs widely used in VoIP, and that it also outperforms the widely used CMS (cepstral mean subtraction) method.
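To make the idea of stochastic matching with EM concrete, the following is a minimal sketch of its simplest special case: a pure additive bias, i.e. a linear compensation function with unit slope. It is not the algorithm of this paper, which estimates full linear and quadratic functions. The diagonal-covariance Gaussian mixture used as the speaker model, and the names `posteriors`, `estimate_bias`, and `compensate`, are assumptions introduced only for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """E-step: posterior probability of each mixture component given x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_bias(frames, weights, means, variances, iterations=10):
    """Estimate an additive bias b such that (y - b) best fits the model.

    Assumed distortion model (a simplification): y = x + b.
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        num = [0.0] * dim
        den = [0.0] * dim
        for y in frames:
            # Posteriors are computed on the currently compensated frame.
            x_hat = [yi - bi for yi, bi in zip(y, bias)]
            gamma = posteriors(x_hat, weights, means, variances)
            for g, m, v in zip(gamma, means, variances):
                for i in range(dim):
                    num[i] += g * (y[i] - m[i]) / v[i]
                    den[i] += g / v[i]
        # M-step: closed-form, variance-weighted update of the bias.
        bias = [n / d for n, d in zip(num, den)]
    return bias


def compensate(frames, bias):
    """Apply the estimated compensation to every distorted frame."""
    return [[yi - bi for yi, bi in zip(y, bias)] for y in frames]


if __name__ == "__main__":
    # Toy 1-D model with two components; clean frames shifted by 0.4.
    weights, means, variances = [0.5, 0.5], [[-2.0], [2.0]], [[1.0], [1.0]]
    distorted = [[c + 0.4] for c in (-2.0, 2.0, -2.5, -1.5, 1.5, 2.5)]
    print(estimate_bias(distorted, weights, means, variances))
```

For comparison, CMS subtracts the per-utterance mean of each cepstral coefficient and makes no use of the speaker model, whereas the sketch above weights the estimate by the model's component posteriors and variances.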