It is well known that the human auditory system achieves performance that automatic speech recognition (ASR) systems cannot match, and that the fractional Fourier transform (FrFT) has unique advantages in non-stationary signal processing. In this paper, a Gammatone filterbank is applied to speech signals for front-end temporal filtering, and acoustic features of the output subband signals are then extracted using the fractional Fourier transform. Because the transform order has a critical effect on the FrFT, an order adaptation method based on the instantaneous frequency is proposed, and its performance is compared with that of a method based on the ambiguity function. ASR experiments are conducted on clean and noisy Putonghua digits. The results show that the proposed features achieve a significantly higher recognition rate than the MFCC baseline, and that the order adaptation method based on instantaneous frequency has much lower complexity than the one based on the ambiguity function. Furthermore, the FrFT-based features achieve the highest recognition rate when the proposed order adaptation method is used.
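To make the front-end stage concrete, the following is a minimal sketch of Gammatone temporal filtering, not the configuration used in this paper. The filter order (4), the bandwidth factor (1.019), the Glasberg and Moore ERB formula, the 25 ms impulse-response length, and the channel centre frequencies are common textbook choices assumed here for illustration; the function names `erb`, `gammatone_ir`, `fir_filter`, and `gammatone_filterbank` are hypothetical.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula, assumed)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4, b=1.019):
    """Sampled gammatone impulse response:
    g(t) = t^(order-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t), peak-normalised.
    """
    n_samples = int(duration_s * fs_hz)
    ir = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * erb(fc_hz) * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]


def fir_filter(signal, ir):
    """Causal FIR filtering by direct convolution, truncated to len(signal)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(ir))):
            acc += ir[k] * signal[n - k]
        out.append(acc)
    return out


def gammatone_filterbank(signal, fs_hz, centre_freqs_hz):
    """Return one subband signal per centre frequency."""
    return [fir_filter(signal, gammatone_ir(fc, fs_hz)) for fc in centre_freqs_hz]


if __name__ == "__main__":
    fs = 8000.0
    # 100 ms test tone at 1 kHz (illustrative input, not speech data)
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(800)]
    subbands = gammatone_filterbank(tone, fs, [500.0, 1000.0, 2000.0])
    for fc, band in zip([500.0, 1000.0, 2000.0], subbands):
        print(fc, sum(v * v for v in band))
```

Each returned subband signal would then be passed to the FrFT-based feature extraction stage. The direct convolution is written for readability; a practical front end would use a vectorised or recursive implementation.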