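Speaker Recognition Based on Particle Swarm Optimization and Fuzzy Clustering
  • Journal: Xue Liping, Yin Junxun, Ji Zhen, Speaker Recognition Based on Particle Swarm Optimization and Fuzzy Clustering, Journal of Shenzhen University Science and Engineering, 2008. V
  • Time: 0
  • Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering] TP18 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]
  • Author affiliations: [1] School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640; [2] College of Software, Shenzhen University, Shenzhen 518060
  • Funding: National Natural Science Foundation of China (60572100); Shenzhen University Research Start-up Fund (200637)
  • Related project: Research on the Particle Swarm Optimization Algorithm and Its Application to Image Compression Coding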
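Chinese abstract (translated):

Based on particle swarm optimization (PSO), this paper proposes a speaker recognition algorithm, the triple-particle fuzzy C-means clustering algorithm. It uses three sub-swarms, each made up of a small group of three particles, to search for the best speaker model. In each iteration, every sub-swarm performs, in order, the PSO velocity update, the PSO position update, and the standard FCM algorithm. The speaker's training speech data thus undergo a combined PSO and fuzzy soft clustering analysis, and the resulting optimal cluster centers serve as that speaker's speech model. The algorithm keeps particles from becoming trapped at locally optimal cluster centers, and it records and estimates fairly accurately the best movement direction and historical path of each cluster center, so that the cluster centers move toward the global optimum. Experiments show that the algorithm consistently achieves better speaker recognition performance than the LBG, FCM, and FRLVQFVQ algorithms, depends little on the initial cluster centers, and effectively reduces the recognition error rate.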
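
The iteration described in the abstract (velocity update, position update, then an FCM step) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses a single swarm of three particles rather than three sub-swarms, each particle holds a full set of cluster centers, and the inertia weight `w`, acceleration constants `c1` and `c2`, fuzzifier `m`, and all function names are hypothetical choices made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(data, centers, m=2.0):
    """Standard FCM membership update; u[k][i] is point k's degree in cluster i."""
    u = []
    for x in data:
        inv = [max(sq_dist(x, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def fcm_step(data, centers, m=2.0):
    """One standard FCM step: recompute memberships, then the weighted centers."""
    u = memberships(data, centers, m)
    dim, n = len(data[0]), len(data)
    new_centers = []
    for i in range(len(centers)):
        wts = [u[k][i] ** m for k in range(n)]
        total = sum(wts)
        new_centers.append(
            [sum(wts[k] * data[k][d] for k in range(n)) / total for d in range(dim)]
        )
    return new_centers

def objective(data, centers, m=2.0):
    """FCM objective: membership-weighted sum of squared distances."""
    u = memberships(data, centers, m)
    return sum(
        u[k][i] ** m * sq_dist(data[k], centers[i])
        for k in range(len(data))
        for i in range(len(centers))
    )

def pso_fcm(data, n_clusters, n_particles=3, iters=30,
            w=0.7, c1=1.5, c2=1.5, m=2.0, seed=0):
    """Hybrid sketch: PSO velocity and position updates followed by an FCM step.

    Assumed simplification: one swarm of n_particles, not the paper's
    three sub-swarms. Returns the best centers found and their objective.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Each particle starts from n_clusters distinct training vectors.
    pos = [[list(x) for x in rng.sample(data, n_clusters)]
           for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(n_clusters)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_cost = [objective(data, p, m) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = [c[:] for c in pbest[g]], pbest_cost[g]
    for _ in range(iters):
        for p in range(n_particles):
            for i in range(n_clusters):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Velocity update toward personal and global best centers.
                    vel[p][i][d] = (w * vel[p][i][d]
                                    + c1 * r1 * (pbest[p][i][d] - pos[p][i][d])
                                    + c2 * r2 * (gbest[i][d] - pos[p][i][d]))
                    # Position update.
                    pos[p][i][d] += vel[p][i][d]
            # Refine the moved centers with one standard FCM step.
            pos[p] = fcm_step(data, pos[p], m)
            cost = objective(data, pos[p], m)
            if cost < pbest_cost[p]:
                pbest[p], pbest_cost[p] = [c[:] for c in pos[p]], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = [c[:] for c in pos[p]], cost
    return gbest, gbest_cost
```

In this sketch the returned centers would play the role of one speaker's model; `data` stands for that speaker's training feature vectors, whatever front end produces them.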

English abstract:

A new strategy for speaker recognition, triple-particle fuzzy C-means (FCM) clustering, called TPFCM, is proposed. Three particle sub-swarms search for the best speaker model using the conventional particle swarm optimization (PSO) algorithm, and the three particles in each sub-swarm are combined into a triple-particle. At each iteration, the triple-particle performs the basic PSO operations and the conventional FCM algorithm in sequence. The speakers' training data are clustered softly, and the best clustering centers are organized as the model of the speaker. This strategy prevents the particle from being trapped in a local optimum, and it memorizes and estimates the best direction in which the particle moves toward the optimum clustering centers. Experimental results demonstrate that the new strategy consistently performs much better than LBG, FCM, and FRLVQFVQ, with lower speaker recognition error rates, and that the dependence of the final optimum clustering solution on the selection of the initial clustering centers is effectively reduced.
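
The abstract does not say how a test utterance is scored against each speaker's clustering centers. One common rule for codebook-style speaker models, assumed here purely for illustration, is to choose the speaker whose centers give the lowest average distortion over the test frames. The names `avg_distortion`, `identify`, and `models` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def avg_distortion(frames, centers):
    """Mean squared distance from each frame to its nearest center."""
    return sum(min(sq_dist(x, c) for c in centers) for x in frames) / len(frames)

def identify(frames, models):
    """Return the speaker whose centers best fit the frames.

    Assumption: minimum average distortion is the decision rule;
    the source abstract does not specify one.
    """
    return min(models, key=lambda name: avg_distortion(frames, models[name]))
```

Here `models` would map each enrolled speaker's name to the clustering centers learned from that speaker's training data.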
