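In speaker recognition research, speaker modeling based on the identity vector (IVEC) extracts speaker information effectively and is currently a state-of-the-art modeling method internationally. This paper studies a speaker recognition system in which the identity vector is followed by a support vector machine (IVEC-SVM), compares the recognition performance of this system under ten different kernel functions, and compares it with the identity vector followed by cosine distance scoring (IVEC-CDS) system from the literature. Experimental results on the telephone-telephone core evaluation database of the 2010 speaker recognition evaluation organized by the U.S. National Institute of Standards and Technology (NIST) show that the kernel-based IVEC-SVM system clearly outperforms the IVEC-CDS system. In addition, the results show that the IVEC-SVM system based on the spline kernel achieves the best recognition performance: compared with the IVEC-CDS system, its equal error rate (EER) is reduced by 10% and 3% before and after score normalization, respectively.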
In text-independent speaker recognition research, identity vector (IVEC) based modeling has recently been shown to be the most efficient method of extracting speaker information. This paper explores and compares the performance of ten different kernel functions in the identity vector followed by support vector machine (IVEC-SVM) system, and compares that system with the identity vector followed by cosine distance scoring (IVEC-CDS) system. Experiments on the telephone-telephone corpus of the speaker recognition evaluation data released by the U.S. National Institute of Standards and Technology (NIST) in 2010 demonstrate that the kernel function based IVEC-SVM system performs better than the IVEC-CDS system. Among all the kernel function based IVEC-SVM systems, the one using the spline kernel function performs best, with relative decreases in equal error rate (EER) of 10% and 3% compared to the IVEC-CDS system before and after score normalization, respectively.
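As an illustration of the quantities named above, the following minimal sketch shows how a cosine distance score between two identity vectors, an approximate EER, and a relative EER reduction could be computed. It is not the authors' implementation: the function names, the threshold sweep used to approximate the EER, and the toy inputs are all assumptions made for this example.

```python
import math


def cosine_score(w_target, w_test):
    """Cosine similarity between two identity vectors (hypothetical helper).

    A higher score suggests the two vectors come from the same speaker.
    """
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm = math.sqrt(sum(a * a for a in w_target)) * math.sqrt(
        sum(b * b for b in w_test)
    )
    return dot / norm


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping the threshold over observed scores.

    Assumption: the EER is taken at the threshold where the miss rate and
    the false-alarm rate are closest, and reported as their average.
    """
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # Miss rate: target trials rejected at threshold t.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials accepted at threshold t.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


def relative_reduction(baseline_eer, new_eer):
    """Relative EER decrease of a new system with respect to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer
```

With this definition, a relative decrease of 10% means that the new system's EER is 0.9 times the baseline EER, not that the EER drops by ten percentage points.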