位置:成果数据库 > 期刊 > 期刊详情页
Audio-visual underdetermined blind source separation algorithm based on Gaussian potential function
  • ISSN号:1673-5447
  • 期刊名称:China Communications
  • 时间:2014.6.22
  • 页码:71-80
  • 分类:O11[理学—数学;理学—基础数学] TP391.4[自动化与计算机技术—计算机应用技术;自动化与计算机技术—计算机科学与技术]
  • 作者机构:[1]Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China, [2]National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications, Beijing 100876,China
  • 相关基金:supported by the National Natural Science Foundation of China(Grant Nos.61162014,61210306074);the Natural Science Foundation of Jiangxi Province of China(Grant No.20122BAB201025);the Foundation for Young Scientists of Jiangxi Province(Jinggang Star)(Grant No.20122BCB23002)
  • 相关项目:基于视听觉信息融合的欠定卷积语音混合信号盲分离及其在机器人听觉系统中应用的研究
中文摘要:

Most existing algorithms for the underdetermined blind source separation(UBSS) problem are two-stage algorithm, i.e., mixing parameters estimation and sources estimation. In the mixing parameters estimation, the previously proposed traditional clustering algorithms are sensitive to the initializations of the mixing parameters. To reduce the sensitiveness to the initialization, we propose a new algorithm for the UBSS problem based on anechoic speech mixtures by employing the visual information, i.e., the interaural time difference(ITD) and the interaural level difference(ILD), as the initializations of the mixing parameters. In our algorithm, the video signals are utilized to estimate the distances between microphones and sources, and then the estimations of the ITD and ILD can be obtained. With the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm is utilized to estimate the mixing parameters by using the ITDs and ILDs as the initializations of the mixing parameters. And the time-frequency masking is used to recover the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.

英文摘要:

Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithm, i.e., mixing parameters estimation and sources estimation. In the mixing parameters estimation, the previously proposed traditional clustering algorithms are sensitive to the initializations of the mixing parameters. To reduce the sensitiveness to the initialization, we propose a new algorithm for the UBSS problem based on anechoic speech mixtures by employing the visual information, i.e., the interaural time difference (ITD) and the interaural level difference (ILD), as the initializations of the mixing parameters. In our algorithm, the video signals are utilized to estimate the distances between microphones and sources, and then the estimations of the ITD and ILD can be obtained. With the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm is utilized to estimate the mixing parameters by using the ITDs and ILDs as the initializations of the mixing parameters. And the time-frequency masking is used to recover the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.

同期刊论文项目
期刊论文 5 会议论文 2
同项目期刊论文
期刊信息
  • 《中国通信:英文版》
  • 中国科技核心期刊
  • 主管单位:中国科学技术协会
  • 主办单位:中国通信学会
  • 主编:刘复利
  • 地址:北京市东城区广渠门内大街80号6层608
  • 邮编:100062
  • 邮箱:editor@ezcom.cn
  • 电话:010-64553845
  • 国际标准刊号:ISSN:1673-5447
  • 国内统一刊号:ISSN:11-5439/TN
  • 邮发代号:2-539
  • 获奖情况:
  • 国内外数据库收录:
  • 被引量:187