Dongli Research Big Data Discovery System (DRDS)

Speaker Recognition Based on Fisher Discrimination Dictionary Learning

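ISSN: 1009-5896
Journal: Journal of Electronics & Information Technology (《电子与信息学报》)
Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China (61071181; 61471145); Major Research Plan of the National Natural Science Foundation of China (91120303)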

Authors: Wang Wei, Han Jiqing, Zheng Tieran, Zheng Guibin, Tao Yao

Keywords: Speaker recognition, Dictionary learning, Sparse representation, Fisher Discrimination (FD)

Abstract:

Sparse representation has been applied successfully to speaker recognition, and a well-constructed dictionary plays an important role in it. This paper introduces structured dictionary learning based on the Fisher criterion into a speaker recognition system. While the discriminative dictionary is learned, each sub-dictionary corresponds to a class label, so training samples of the same class have a small reconstruction error. At the same time, the sparse coding coefficients are constrained to have small within-class scatter and large between-class scatter. On the NIST SRE 2003 database, the proposed method achieves an Equal Error Rate (EER) of 7.62%, while an i-vector system with cosine distance scoring gives an EER of 6.7%. Fusing the two systems yields an EER of 5.07%.
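
The Fisher criterion mentioned in the abstract can be illustrated with a minimal Python sketch that measures the within-class and between-class scatter of a set of coding coefficients. This is not the authors' implementation: the function name `fisher_scatter`, the column-per-sample layout, and the toy data are assumptions made only for illustration, and the sketch computes the scatter traces without performing any dictionary learning or sparse coding.

```python
import numpy as np


def fisher_scatter(codes, labels):
    """Return (within, between) scatter traces of coding coefficients.

    codes:  array of shape (n_atoms, n_samples), one coefficient column per sample
    labels: array of shape (n_samples,), class (speaker) label of each sample
    Both the name and the data layout are illustrative assumptions.
    """
    codes = np.asarray(codes, dtype=float)
    labels = np.asarray(labels)
    global_mean = codes.mean(axis=1, keepdims=True)

    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        class_codes = codes[:, labels == c]
        class_mean = class_codes.mean(axis=1, keepdims=True)
        # Trace of the within-class scatter: squared distance of each
        # sample's coefficients to the mean of its own class.
        within += np.sum((class_codes - class_mean) ** 2)
        # Trace of the between-class scatter: class size times squared
        # distance of the class mean to the global mean.
        between += class_codes.shape[1] * np.sum((class_mean - global_mean) ** 2)
    return float(within), float(between)


# Toy usage: two classes with two samples each (made-up numbers).
toy_codes = [[1, 3, 0, 0],
             [0, 0, 2, 4]]
toy_labels = [0, 0, 1, 1]
within, between = fisher_scatter(toy_codes, toy_labels)
```

A Fisher-style objective favors coefficients for which `within` is small and `between` is large, which is the property the abstract describes for the learned sparse codes.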

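Projects associated with this journal paper

Research on Paragraph-Level Audio Semantic Recognition Based on Intrinsic and Latent Semantic Features

Journal papers: 2

Auditory Models and Key Sound-Processing Technologies for Driving Environments

Journal papers: 41; Conference papers: 55; Books: 2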

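Journal papers from the same project

A Broadband MUSIC Sound Source Localization Method Using Auditory Filters

A Comparative Analysis of Automatic Pitch Accent Annotation Methods for Chinese and English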

Integrating Binary Mask Estimation with MRF Priors of Cochleagram for Speech Separation

Statistical voice activity detection based on sparse representation over learned dictionary

A Generalization Algorithm from Binary Time-Frequency Masking to Real-Valued Masking Based on Noise Tracking

Auditory filter based broadband MUSIC algorithm for sound source localization

Optimization of Learned Dictionary for Sparse Coding in Speech Processing

A signal subspace dimension estimator based on F-norm with application to subspace-based multi-chann

The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.

The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sens

A new Bayesian method incorporating with local correlation for IBM estimation

Latent topic model for audio retrieval

A novel signal subspace dimension estimator based on F-norm with application to subspace-based multi

Sparse Representation with Optimized Learned Dictionary for Robust Voice Activity Detection

A new framework for robust speech recognition in complex channel environments

Noise Robust Direction of Arrival Estimation for Speech Source With Weighted Bispectrum Spatial Corr

Spectrum enhancement with sparse coding for robust speech recognition

Dictionary evaluation and optimization for sparse coding based speech processing

Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptati

Soft Margin Based Low-Rank Audio Signal Classification

Acoustic Feature Extraction Based on the Frequency Selectivity of the Human Ear in Driving Noise Environments

Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models

Fast Audio Retrieval Using Symbolized LSH Address Based on p-stable Distribution

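A Robust Audio Retrieval Method Based on Locality-Sensitive Hashing Addresses with p-Stable Distributions

Research on Speech Recognition Decoding Algorithms Integrating Induced Probability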

SPARSE BASED AUDITORY MODEL FOR ROBUST SPEAKER RECOGNITION

Integrating Induced Probability into Decoding for Large Vocabulary Continuous Speech Recognition

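Research Status and Progress of Deep-Learning-Based Speech Separation

A Survey of Robust Acoustic Event Detection

Robust Speaker Recognition Based on Sparse Coding

Adjacent Vertex Distinguishing I-Total Coloring of Some Double Graphs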

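Journal information

Journal of Electronics & Information Technology (《电子与信息学报》)
Chinese Science and Technology Core Journal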

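Supervising organization: Chinese Academy of Sciences
Sponsors: Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China
Editor-in-Chief: Zhu Minhui
Address: No. 19 North Fourth Ring Road West, Beijing
Postal code: 100190
Email: jeit@mail.ie.ac.cn
Telephone: 010-58887066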

International Standard Serial Number: ISSN 1009-5896
Domestic serial number: CN 11-4494/TN
Postal distribution code: 2-179

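Awards:

Indexed in domestic and international databases:
Scopus abstract and citation database (Netherlands), Engineering Index (USA), Cambridge Scientific Abstracts (USA), Japan Science and Technology Agency database (Japan), Chinese Science and Technology Core Journals (China), Peking University Core Journals (China; 2004, 2008, 2011, and 2014 editions)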

Citations: 24739