Speaker Recognition Based on Time-Frequency Distribution and MFCC

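ISSN: 1003-3254
Journal: 计算机系统应用 (Computer Systems & Applications)
Year: 2012
Pages: 189-192+178
Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi 214122
Funding: National Natural Science Foundation of China (61075008)
Related project: Research on new time-frequency perceptual feature extraction for Chinese speech signals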

Keywords: short-time Fourier transform (STFT), Wigner-Ville distribution (WVD), Choi-Williams distribution (CWD), Mel-frequency cepstral coefficients (MFCC), speaker recognition

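Abstract (translated from Chinese):

To address the problem that MFCC alone does not yield high speaker recognition performance, a speaker feature extraction method that combines time-frequency features with MFCC is proposed. First, the time-frequency distribution of the speech signal is obtained. The time-frequency domain is then converted to the frequency domain, and MFCC+ΔMFCC are extracted as feature parameters. Finally, a support vector machine is used for speaker recognition. Simulation experiments compare the recognition performance obtained from the speech signal and from several time-frequency distributions when MFCC and MFCC+ΔMFCC are used as feature parameters. The results show that the recognition rate with MFCC and ΔMFCC based on the CWD can be raised to 95.7%.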

English abstract:

Because MFCC alone cannot reflect the dynamic characteristics of the speech signal or its non-stationarity, a feature extraction method that combines time-frequency distributions with MFCC is proposed. First, the time-frequency distribution of the speech signal is computed and the time-frequency domain is converted into the frequency domain; MFCC+ΔMFCC are then extracted as feature parameters. Finally, a support vector machine performs speaker recognition. Simulation experiments compare recognition performance with MFCC and with MFCC+ΔMFCC as feature parameters, computed from the speech signal and from several time-frequency distributions. The results show that the recognition rate using MFCC+ΔMFCC based on the CWD time-frequency distribution can be raised to 95.7%.
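
The abstract does not say how the dynamic coefficients are computed. As a minimal sketch, assuming the second feature set is the common first-order regression delta of the MFCC sequence (ΔMFCC), the frame-level concatenation of static and dynamic coefficients might look like the following. The function names `delta_features` and `append_deltas` and the window `width=2` are illustrative choices, not taken from the paper, and the MFCC frames are assumed to have been computed already (from the signal spectrum or from a time-frequency distribution such as the CWD).

```python
from typing import List, Sequence


def delta_features(frames: Sequence[Sequence[float]], width: int = 2) -> List[List[float]]:
    """First-order regression deltas over a sequence of feature frames.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)

    Frames requested beyond either end are replaced by the nearest edge frame.
    This is a generic delta formula, assumed here; the paper may use another.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    num_frames = len(frames)
    if num_frames == 0:
        return []
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]   # clamp at last frame
                behind = frames[max(t - n, 0)][k]               # clamp at first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def append_deltas(frames: Sequence[Sequence[float]], width: int = 2) -> List[List[float]]:
    """Concatenate each static frame with its delta frame: [c_t, d_t]."""
    return [list(c) + d for c, d in zip(frames, delta_features(frames, width))]


if __name__ == "__main__":
    # Toy "MFCC" frames with two coefficients each (hypothetical values).
    toy = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
    for vector in append_deltas(toy):
        print(vector)
```

Each output vector holds the static coefficients followed by their deltas; vectors of this kind are what would be handed to a classifier such as the support vector machine mentioned in the abstract.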
