A Study of Speech Recognition Decoding Algorithms Integrating Induced Probability (融合引导概率的语音识别解码算法研究)

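Journal: 声学学报 (Acta Acustica, Chinese edition)
Date: 2012.3
Pages: 209-217
Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliation: [1] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Funding: Supported by the National Basic Research Program of China (973 Program) (2004CB318105), the National High Technology Research and Development Program of China (863 Program) (20060101Z4073, 2006AA012194), and the National Natural Science Foundation of China (90820011, 60675026, 90820303).
Related project: Auditory Models for Driving Environments and Key Technologies for Sound Processing

Authors: Yang Zhanlei (杨占磊) | Liu Wenju (刘文举) | Chao Hao (晁浩)

Keywords: speech recognition system, probabilistic model, decoding algorithm, induction, location information, feature space, local space, speech frame, Computational linguistics, Continuous speech recognition, Probability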

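Abstract (translated from Chinese):

The location of a speech frame in the acoustic feature space can help the decoder screen potential paths. Conventional speech recognition systems do not exploit this location information. To address this shortcoming, this paper proposes an induced probability model that describes the probability that a speech frame belongs to each local region of the acoustic feature space, and applies it to recognition. With the induced probability, the decoder places more emphasis on searching the most promising local region of the acoustic feature space: it retains and extends paths that pass through this region while weakening paths that do not. Experimental results show that the decoding algorithm integrating the induced probability reduces the Chinese character error rate by a relative 10.95% without significantly increasing decoding complexity. Analysis of the results indicates that a decoding method incorporating the acoustic location information of speech frames discriminates potential paths more effectively and thereby lowers the misrecognition rate.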

English abstract:

This paper integrates location information of frames into the conventional acoustic model (AM) and language model (LM) likelihoods in order to distinguish potential path candidates more precisely at the decoding stage. It proposes an induced probability, which represents the location of a frame within the whole acoustic space. By integrating the induced probability, the decoder is directed to search within the most promising regions of the acoustic space: promising paths are enhanced and unlikely paths are weakened. Experiments conducted on Chinese Putonghua show a relative reduction of 10.95% in character error rate without a significant increase in decoding complexity. Finally, pruning analysis shows that integrating location information of frames into the traditional decoding framework helps improve system performance.
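The abstract does not say how the induced probability is fused with the AM and LM likelihoods. As a rough illustration only, the sketch below assumes a log-linear combination and a simple beam-pruning step; the names `Path`, `combined_score`, `beam_prune`, the weights, and all numeric values are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Path:
    """A partial decoding hypothesis (hypothetical structure for illustration)."""
    label: str
    log_am: float       # acoustic model log-likelihood
    log_lm: float       # language model log-probability
    log_induced: float  # log of the assumed induced probability for the path's region


def combined_score(path, lm_weight=1.0, induced_weight=1.0):
    # Assumed log-linear fusion; the paper's actual formula is not given in the abstract.
    return path.log_am + lm_weight * path.log_lm + induced_weight * path.log_induced


def beam_prune(paths, beam, lm_weight=1.0, induced_weight=1.0):
    """Keep paths whose combined score is within `beam` of the best score."""
    scored = [(combined_score(p, lm_weight, induced_weight), p) for p in paths]
    best = max(score for score, _ in scored)
    return [p for score, p in scored if score >= best - beam]


# Two made-up candidates with similar AM and LM scores; path B passes through
# a region of the acoustic space that the induced probability considers unlikely.
candidates = [
    Path("A", log_am=-10.0, log_lm=-2.0, log_induced=math.log(0.8)),
    Path("B", log_am=-10.5, log_lm=-2.0, log_induced=math.log(0.05)),
]

without_induced = [p.label for p in beam_prune(candidates, beam=2.0, induced_weight=0.0)]
with_induced = [p.label for p in beam_prune(candidates, beam=2.0, induced_weight=1.0)]
print(without_induced, with_induced)
```

With the induced term switched off, both candidates stay inside the beam. With it switched on, the path through the unlikely region falls outside the beam and is pruned, which mirrors the abstract's description of promising paths being enhanced and unlikely paths weakened.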

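Projects with papers in the same journal

Segment-Model-Based Chinese LVCSR Methods Guided by Speech Knowledge and a Global Optimality Criterion

Journal papers: 21; conference papers: 25; awards: 2; patents: 3

Dialogue-Management-Centered Bidirectional Multimodal Spoken Human-Computer Interaction

Journal papers: 40; conference papers: 34

Auditory Models for Driving Environments and Key Technologies for Sound Processing

Journal papers: 41; conference papers: 55; monographs: 2

New Speech Separation Methods Based on Objective Quality Assessment and Audio Scene Analysis

Journal papers: 36; conference papers: 22; awards: 2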

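Journal papers from the same project

Broadband MUSIC Sound Source Localization Using Auditory Filters

Comparative Analysis of Automatic Pitch Accent Annotation Methods for Chinese and English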

Integrating Binary Mask Estimation with MRF Priors of Cochleagram for Speech Separation

Statistical voice activity detection based on sparse representation over learned dictionary

A Noise-Tracking-Based Algorithm for Generalizing Binary Time-Frequency Masks to Real-Valued Masks

Auditory filter based broadband MUSIC algorithm for sound source localization

Optimization of Learned Dictionary for Sparse Coding in Speech Processing

A signal subspace dimension estimator based on F-norm with application to subspace-based multi-chann

The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.

The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sens

A new Bayesian method incorporating with local correlation for IBM estimation

Latent topic model for audio retrieval

A novel signal subspace dimension estimator based on F-norm with application to subspace-based multi

Sparse Representation with Optimized Learned Dictionary for Robust Voice Activity Detection

A new framework for robust speech recognition in complex channel environments

Noise Robust Direction of Arrival Estimation for Speech Source With Weighted Bispectrum Spatial Corr

Spectrum enhancement with sparse coding for robust speech recognition

Dictionary evaluation and optimization for sparse coding based speech processing

Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptati

Soft Margin Based Low-Rank Audio Signal Classification

Acoustic Feature Extraction Based on the Frequency Selectivity of the Human Ear in Driving Noise Environments

Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models

Fast Audio Retrieval Using Symbolized LSH Address Based on p-stable Distribution

Robust Audio Retrieval Based on Locality-Sensitive Hashing Addresses with p-Stable Distributions

SPARSE BASED AUDITORY MODEL FOR ROBUST SPEAKER RECOGNITION

Integrating Induced Probability into Decoding for Large Vocabulary Continuous Speech Recognition

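Deep-Learning-Based Speech Separation: Research Status and Progress

A Survey of Robust Acoustic Event Detection

Robust Speaker Recognition Based on Sparse Coding

Speaker Recognition Based on Fisher Discrimination Dictionary Learning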

Layer-based dependency parsing

Chinese Maximal-Length Phrase Identification Based on Fusion of Bidirectional Labeling

A Novel Prosody Adaptation Method for Mandarin Concatenationbased Text-To-Speech System

A Media Control Interface Based on Mobile Phone Gesture Recognition

HMM-based expressive speech synthesis with a flexible Mandarin stress adaptation model

Multi-culture Facial Attractiveness Enhancement Based on Double Knowledge Transfering

Relative Pose Calibration Between a Camera and an Inertial Measurement Unit

Which is more suitable for Chinese word segmentation, the generative model or the discriminative one

Approach to selecting best development set for phrase-based statistical machine translation

A framework for effectively integrating hard and soft syntactic rules into phrase based translation

A Virtual Human Sign Language Synthesis and Display Platform for Broadcast Television Programs

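Multi-Channel Speech Enhancement Based on F-Norm Signal Subspace Dimension Estimation

A Prosody-Dependent Chinese Speech Recognition System

A Monaural Voiced Speech Separation System with Improved Harmonic Grouping Rules

Classification of Chinese Prosodic Breaks Based on Prosodic Break Levels

Chinese Stress Detection Based on Complementary Models

Automated Scoring of Chinese Essays Based on Vocabulary Scoring

A Robust Method for Rejecting Mistracked Optical Flow Points

Microphone Array Post-Filter Speech Enhancement Based on Multiple Statistical Models and Human Auditory Characteristics

Very-Low-Bit-Rate Speech Coding for Narrowband Communication

Chinese Stress Detection Based on Acoustic Features and Lexical and Syntactic Features

A Survey of System Combination Techniques for Machine Translation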

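Signal Subspace Microphone Array Speech Enhancement Based on Auditory Perceptual Properties

An Improved Monaural Mixed Speech Separation Method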

Unsupervised Learning of Gaussian Mixture Model with Application to Image Segmentation

From English pitch accent detection to Mandarin stress detection, where is the difference?

Monaural Voiced Speech Segregation Based on Dynamic Harmonic Function

Monaural speech separation based on MAXVQ and CASA for robust speech recognition

Research Progress in Chinese Large Vocabulary Continuous Speech Recognition Systems

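A Robust Speech Recognition Front-End Based on Computational Auditory Scene Analysis and Speaker Model Information

Automatic Detection of Chinese Prosodic Breaks Based on Complementary Models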

Mandarin stress detection using acoustic, lexical and syntactic features

Tone Recognition for Continuous Chinese Speech Based on Multi-Space Probability Distributions

Monaural voiced speech segregation based on elaborate harmonic grouping strategies

Robust front-end for speech recognition based on computational auditory scene analysis and speaker m

Duration and Pitch of Chinese Prosodic Phrases

A modified monaural mixture speech separation method

Signal Subspace Speech Enhancement Based on a Gaussian-Laplacian-Gamma Model and the Human Auditory Masking Effect

Perceptual properties based signal subspace microphone array speech enhancement algorithm

Monaural Mixed Speech Separation Based on Multi-Pitch Tracking
