针对爆发谱特征不稳定的问题,论文提出了一种基于能量变化率的汉语塞音检测方法.该方法首先基于Seneff听觉谱提取了一组描述音段能量变化率特性的参数,然后采用Fisherface方法进行特征变换,变换后的特征采用K近邻(KNN)分类器进行分类,实现了塞音的检测,最后利用留一法对模型性能进行交叉验证.实验结果表明,干净语音塞音检测准确率可以达到96.39%,信噪比10dB的语音塞音检测准确率可达到88.07%,模型具有较好的稳定性和泛化性能.
In order to solve the issue of unreliable burst spectrum feature, a Chinese stop detection method based on energy change rate characteristic is proposed. The energy change rate features are first acquired from the Seneff's au- ditory spectrum, and then transformed by Fisherface approach. Finally the KNN classifier is implemented to realize stop detection. Tested by leave-one-out cross validation, the results indicate a good performance of high stability and generalization: the accuracy is 96.39% for clean speech and 88.07% for noisy speech with the SNR of 10dB.