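In complex acoustic environments, interference from environmental noise makes acoustic features insufficiently stable. To overcome this problem, the decision results are usually smoothed along the time dimension. However, such smoothing does not take into account the structure of the data along the time dimension and is therefore heuristic. This paper uses dynamic segmentation to partition the spectral envelope of speech along the time dimension into time blocks with homogeneous features, computes energy features block by block, and makes the speech/non-speech decision per block, thereby improving the stability of speech endpoint detection. Experiments show that the proposed method effectively improves the robustness of speech endpoint detection.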
Acoustic features are not sufficiently robust under interference from environmental noise. Heuristic approaches that smooth the noisy spectra have been introduced to address this problem, but these methods do not consider the intrinsic correlation in the time domain. This paper presents a novel method of endpoint detection in which the time sequence of logarithmic power is partitioned into homogeneous blocks using dynamic auto-segmentation. The acoustic feature is extracted from each homogeneous block, and endpoint detection is then carried out with the homogeneous block as the basic unit. The experimental results show that the proposed method improves the robustness of endpoint detection.
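The block-based idea can be illustrated with a minimal sketch. The abstract does not specify the segmentation criterion, so the code below substitutes a generic penalized least-squares segmentation solved by dynamic programming; it is not the authors' algorithm. The names `segment_blocks`, `detect_speech`, `penalty`, and `threshold` are hypothetical, and the input values are made up for illustration.

```python
from itertools import accumulate


def segment_blocks(log_power, penalty):
    """Split a sequence into contiguous blocks of roughly constant level.

    Assumed criterion (not from the paper): minimize the sum of within-block
    squared deviations plus `penalty` per block, via dynamic programming.
    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(log_power)
    if n == 0:
        return []
    # Prefix sums give any block's squared error in O(1).
    s1 = [0.0] + list(accumulate(log_power))
    s2 = [0.0] + list(accumulate(x * x for x in log_power))

    def block_cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: minimal cost of frames [0, j)
    prev = [0] * (n + 1)               # prev[j]: start index of the last block
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + block_cost(i, j) + penalty
            if cost < best[j]:
                best[j], prev[j] = cost, i

    # Walk back through the stored block starts to recover the partition.
    blocks = []
    j = n
    while j > 0:
        blocks.append((prev[j], j))
        j = prev[j]
    return blocks[::-1]


def detect_speech(log_power, penalty, threshold):
    """Label every frame by the mean log power of the block it belongs to."""
    labels = []
    for start, end in segment_blocks(log_power, penalty):
        mean_level = sum(log_power[start:end]) / (end - start)
        labels.extend([mean_level > threshold] * (end - start))
    return labels


# Illustrative frame-level log power: low, high, low (values are made up).
frames = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1, 1.0, 0.8, 1.1]
print(segment_blocks(frames, penalty=1.0))  # [(0, 4), (4, 8), (8, 11)]
print(detect_speech(frames, penalty=1.0, threshold=3.0))
```

Because the decision is taken once per block rather than once per frame, a short noise burst is absorbed into its surrounding block whenever the penalty outweighs the gain from isolating it. In this sketch both the penalty and the threshold are free parameters that would have to be tuned on data.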