With the rapid development of network technology and media applications, traditional text-based retrieval can no longer meet demand, and video retrieval remains impractical because of the large volume of data involved, so speech retrieval has become an important research topic. A speech sequence is composed of many different types of audio segments, and each type often carries a different meaning; segmenting audio by its features is therefore an important means of retrieving modern media data. In this paper, the basic feature value of each audio frame is compared with the average basic feature value of the entire sequence to obtain an improved feature value, which is then used with the K-Nearest Neighbor algorithm to segment the audio. Experimental results show that the segmentation algorithm based on the improved feature value effectively improves the accuracy of audio segmentation.
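
The pipeline outlined above (per-frame feature, comparison with the clip average, K-Nearest Neighbor labeling, then grouping frames into segments) can be illustrated with a minimal sketch. The abstract does not say which basic feature is used or how the comparison is made, so this sketch assumes short-time energy as the feature and a simple ratio to the clip mean as the comparison. All function names and the toy training values are hypothetical and are not taken from the paper.

```python
from collections import Counter


def frame_energies(signal, frame_len):
    """Short-time energy of each non-overlapping frame.

    Assumption: energy stands in for the unspecified basic feature.
    """
    n_frames = len(signal) // frame_len
    return [
        sum(s * s for s in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def relative_features(features):
    """Compare each frame feature with the clip average.

    Assumption: the comparison is a ratio to the mean.
    """
    if not features:
        return []
    mean = sum(features) / len(features)
    if mean == 0:
        return [0.0 for _ in features]
    return [f / mean for f in features]


def knn_predict(train_x, train_y, x, k=3):
    """Label x by majority vote among its k nearest training values."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def segment(labels):
    """Merge runs of identical frame labels into (start, end, label) tuples."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i - 1, labels[start]))
            start = i
    return segments


def segment_clip(signal, frame_len, train_x, train_y, k=3):
    """Run the whole sketch: features, relative features, KNN, segments."""
    feats = relative_features(frame_energies(signal, frame_len))
    labels = [knn_predict(train_x, train_y, f, k) for f in feats]
    return segment(labels)


if __name__ == "__main__":
    # Toy clip: four quiet frames followed by four loud frames.
    clip = [0.1] * 40 + [1.0] * 40
    # Hypothetical labeled relative-feature values used as training data.
    train_x = [0.0, 0.1, 0.2, 1.5, 2.0, 2.5]
    train_y = ["quiet"] * 3 + ["loud"] * 3
    print(segment_clip(clip, 10, train_x, train_y))
```

Dividing by the clip mean makes the feature independent of overall recording level, which is one plausible reason to compare each frame against the sequence average; the paper's actual definition of the improved feature may differ.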