Dongli Research Big Data Discovery System (DRDS)

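Analysis of Individual Dimensions of Articulatory Features and Subset Optimization for Speech Recognition Confidence Measures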

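Journal: Acta Acustica (声学学报)
Date: not recorded (shown as 0)
Pages: 339-348
Language: Chinese
Classification: O236 [Science: Operations Research and Cybernetics; Science: Mathematics]
Author affiliation: [1] ThinkIT Speech Laboratory, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
Funding: supported by the National Key Technology R&D Program (2008BAI50B03) and the National Natural Science Foundation of China (10925419, 90920302, 10874203, 60875014)
Related project: Acoustic feature analysis, modeling, and objective evaluation methods for compensatory articulation in cleft palate speech

Authors: Zhao Qingwei (赵庆卫) | Sun Yanqing (孙艳庆) | Zhang Qingqing (张晴晴) | Zhou Yu (周瑜) | Yan Yonghong (颜永红)

Keywords: confidence estimation, speech recognition, features, articulation, dimension, optimization, subset, hidden Markov model, feature extraction, redundancy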

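Chinese abstract (translated):

A confidence algorithm based on individual dimensions of articulatory features (AF) is proposed, and the dimensions of the articulatory features are analyzed on that basis. The analysis not only confirms the need for fusion but also reveals substantial redundancy among the articulatory feature dimensions and between them and the hidden Markov model (HMM). To remove this redundancy, optimization by subset selection is proposed. Compared with using all dimensions, speech recognition confidence estimation based on a compact subset of articulatory features achieves a 12.7% relative reduction in equal error rate (EER). Fusing the optimized AF-based confidence with the HMM-based confidence significantly improves rejection of out-of-grammar input without degrading in-set recognition accuracy: with the same parameters, relative improvements of 34% and 35.3% are obtained on the development and test sets, respectively.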

English abstract:

Different articulatory properties are analyzed in terms of confidence measures, using a separate AF-based confidence calculation method. The analysis not only verifies the necessity of combining them, but also demonstrates a great deal of redundancy between the articulatory properties and the HMM. To reduce the redundancy, a subset selection method is proposed. Experiments are designed to verify these assumptions. Compared with using all properties together, the confidence measures based on the compact subset of articulatory features achieve a relative decrease of 12.7% in EER. The optimized AF-based confidence is finally combined with the HMM-based confidence, which increases the rejection rate on out-of-vocabulary tests with no accuracy loss on in-vocabulary tests; the relative improvement is 34% on the development set and 35.3% on the test set with the same parameters.
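The abstract reports its gains as relative changes in equal error rate (EER). As a rough illustration of how such figures are typically computed, the sketch below estimates an EER from two lists of confidence scores and then expresses an improvement as a relative reduction. It is not the authors' implementation: the function names, the threshold sweep over observed scores, and the sample numbers are all assumptions made for illustration.

```python
def equal_error_rate(accept_scores, reject_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    accept_scores: confidences of inputs that should be accepted
    reject_scores: confidences of inputs that should be rejected
    (Hypothetical helper; the paper does not specify its procedure.)
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(accept_scores) | set(reject_scores)):
        # False rejection: a valid input scored below the threshold.
        frr = sum(s < t for s in accept_scores) / len(accept_scores)
        # False acceptance: an invalid input scored at or above the threshold.
        far = sum(s >= t for s in reject_scores) / len(reject_scores)
        # Keep the operating point where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative decrease of an error rate, as a fraction of the baseline."""
    return (baseline - improved) / baseline


# Made-up scores, not data from the paper.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

Under this definition, a "12.7% relative decrease" means the EER of the compact subset is 12.7% lower than the EER obtained with all dimensions, measured as a fraction of the latter, not a drop of 12.7 percentage points.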

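Projects with papers in the same journal

Research on automatic assessment and correction methods for stuttered speech

Journal papers 18, conference papers 14

Speech separation, content analysis, and understanding in multi-speaker, multi-party conversations

Journal papers 111, conference papers 69, awards 6

Acoustic feature analysis, modeling, and objective evaluation methods for compensatory articulation in cleft palate speech

Journal papers 24, conference papers 15, patents 4

Digital modeling of speech acoustics for speech processing

Journal papers 81, conference papers 59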

Journal papers from the same project

Development of a Mandarin-English Bilingual Speech Recognition System with Unified Acoustic Models

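Non-native speech recognition based on a mixture-model state modification algorithm

An efficient graph search algorithm for embedded speech recognition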

An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns

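Application of perceptual minimum variance distortionless response (MVDR) cepstral coefficients to speaker recognition

Acoustic analysis of stop consonants in fluent reading by adult stutterers

Research on an automatic scoring system for spoken English tests based on multi-feature fusion

Application of joint factor analysis and sparse representation to robust speaker verification

Algorithm research and implementation of an automatic caption generation system based on online speech streams

Automatic scoring of English passage reading quality

A digital audio watermarking algorithm for copyright management

Research progress in speech acoustics and content understanding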

Acoustic characteristics of stop consonants in fluent reading Chinese Putonghua speech of adult stutterers

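Study of the relationship between language development and recovery of velopharyngeal closure function after cleft palate surgery in young children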

The effects of unrepaired cleft palate on early language development in Chinese infants

Effect of age at cleft palate surgery on the incidence of compensatory articulation

Towards precise and robust automatic synchronization of live speech and its transcripts

A camera self-calibration algorithm based on coplanar circles

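Sub-band energy normalized perceptual linear prediction coefficients for noise-robust speech recognition

Reconstruction of perceived sound image distance in wave field synthesis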

Objective evaluation of cleft palate speech based on analyzing plosive consonants

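A soccer robot self-localization method combining odometry and visual information

Animal models and neural mechanisms of top-down modulation of auditory sensorimotor gating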

Informational masking of speech produced by speech-like sounds without linguistic content

Discriminative training of GMM-HMM acoustic model by RPCL learning

Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition

Effects of aging on the ability to benefit from prior knowledge of message content in masked speech

Tone Enhancing Model for Disyllable Words in Chinese Mandarin Speech

A fast language model prediction algorithm based on an extended N-gram model

Improved Keyword Spotting System in Weighted Finite-State Transducer Framework

A Novel Discriminative Method for Pronunciation Quality Assessment

Aging effects on detection of spectral changes induced by a break in sound correlation

Discriminative GMM-HMM Acoustic Model Selection Using Two-Level Bayesian Ying-Yang Harmony Learning

Lightly Supervised Acoustic Model Training for Mandarin Continuous Speech Recognition

A comparative study of RPCL and MCE based discriminative training methods for LVCSR

Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise

Relationship between Distance and Binaural Cues on Sound Source Localization (in Chinese)

Harmonic Structure Features for Robust Speaker Diarization

Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation

Comparative intelligibility investigation of single-channel noise-reduction algorithms for Chinese,

A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features

Recent applications of speech acoustics

Speech Enhancement Using Robust Generalized Side lobe Canceller with Multi-Channel Post-Filtering in

Factor Analysis of Neighborhood Preserving Embedding for Speaker Verification

Language Recognition with Language Total Variability

Logarithmic adaptive quantization projection for audio watermarking

A Novel Similarity Measure to Induce Semantic Classes and Its Application for Language Model Adaptat

Noise Estimation Using a Constrained Sequential Hidden Markov Model in the Log-Spectral Domain

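A reverberation suppression method based on spatial sound field diffuseness information

Research on speaker diarization and localization techniques based on NIST evaluations

A query-by-humming retrieval algorithm based on a frame-note approach

Discriminative local information distance preserving mapping for set classification

Speaker recognition based on a linear log-likelihood kernel function

Speech/music classification based on MLER and GMM

Survey and comparison of the performance improvements of various lattice-based discriminative training methods in Mandarin monolingual and Mandarin-English bilingual speech recognition systems (in English)

Construction of a compact dynamic network for a large-vocabulary continuous speech recognition engine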

Multi-stream posterior features and combining subspace GMMs for low resource LVCSR

Multi-resolution time frequency feature and complementary combination for short utterance speaker re

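Research on a kernel-based IVEC-SVM speaker recognition system

Research on an i-vector speaker recognition system based on total variability subspace adaptation

A spoken intent understanding method based on multimodal information fusion

Research on robust character-based Chinese spoken language understanding using domain information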

Acoustic characteristics of stop consonants in fluent reading Chinese Putonghua speech of adult stu

Auditory frequency-following response: a neurophysiological measure for studying the “cocktail-party

Discrimination Between Pathological and Normal Voices Using GMM-SVM Approach

Fast Speech Recognition System Using Weighted Finite-State Transducers

Voice Activity Detection Based on an Unsupervised Learning Framework

Speaker Recognition Using Sparse Probabilistic Linear Discriminant Analysis

A Forced Alignment Based Approach for English Passage Reading Assessment

Factor Analysis of Neighborhood-Preserving Embedding for Speaker Verification

Discriminative Decision Function Based Scoring Method Used in Speaker Verification

A text-independent speaker recognition algorithm based on TLS-NAP

Soccer Robot Self-Localization by Combining Odometry and Visual Information (in Chinese)

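An eigenchannel space concatenation method in joint factor analysis

A method for improving mispronunciation detection based on an optimized detection network and MLP features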

Adding irrelevant information to the content prime reduces the prime-induced unmasking effect on spe

Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Sp

Enhanced Word Classing for Recurrent Neural Network Language Model

Perceptual MVDR-based cepstral coefficients for speaker recognition

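Study of the relationship between interaural time and level differences and sound source distance cues

Research on zoom and focus noise suppression for Sanyo cameras

Research on methods for removing autofocus mechanical noise in digital video cameras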

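Improving automatic speech recognition confidence through multi-knowledge-source fusion based on hidden-unit conditional random fields

Application of multi-domain system fusion in speech cloud systems

Activation-word speech recognition using a double-scoring method

Research on discriminative maximum a posteriori linear regression for speaker adaptation

Study of the auditory perception of vowels and consonants in speech

Discriminative maximum a posteriori acoustic model adaptation

Weakly supervised training methods for Chinese spoken language understanding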

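Acoustic echo control based on frequency-domain stage-wise regression

Research on voice activity detection in moving-vehicle environments

Noise power spectrum estimation using a constrained sequential Gaussian mixture model for speech enhancement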

HMM-based noise estimator for speech enhancement

A forced alignment approach to detect Chinese repetitive stuttering

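Optimization of triphone models in a Mandarin continuous speech recognition system

A dynamic matching word lattice generation algorithm based on weighted finite-state machines

Research on PLDA-based multi-channel, multi-speech speaker verification

Application of Gaussian PLDA to speaker verification and its joint estimation

An automatic corpus generation algorithm for statistical language modeling of spoken language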

Noise Robust Feature Scheme for Automatic Speech Recognition Based on Auditory Perceptual Mechanisms

Synthesis of Perceived Distance in Wave Field Synthesis

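Study of the effect of single-channel speech enhancement algorithms on Mandarin speech intelligibility

An experimental study of Mandarin pronunciation quality assessment

Research on mixed bilingual speech recognition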

Low-dimensional Representation of Gaussian Mixture Model Supervector For Language Recognition

A language identification method based on one-versus-one SVM classification

Efficient System Combination for Chinese Spoken Term Detection

Extraction of semantic classes and its application in voice search systems

Robust and Fast Localization of Single Speech Source Using a Planar Array

Detecting anticipatory effects in speech articulation by means of spectral coefficient analyses

A Bayesian logistic regression approach to spoken language identification

Maximum a Posteriori Linear Regression for Language Recognition

Acoustic Feature Optimization Based on F-Ratio for Robust Speech Recognition

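Acoustic modeling of Mandarin Chinese speech based on articulatory features

An automatic caption generation system based on online speech streams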

Block based language model for target domain adaptation towards Web corpus
