With the arrival of the big data era, the volume of audio and video files keeps growing, and efficiently locating key sensitive information in them is of considerable research importance. Speech retrieval has been studied in depth for English and Chinese, whereas speech retrieval for Uyghur is still in its infancy. This paper investigates Uyghur spoken keyword retrieval and builds a Uyghur speech retrieval system that combines large vocabulary continuous speech recognition, clustering-based conversion of multi-candidate word lattices into confusion networks, an inverted index, and the computation of confidence and relevance scores. The system is evaluated on a test set: at a speech recognition accuracy of 82.1%, retrieval recall reaches 97.0% and 79.1%, with false alarm rates of 13.5% and 8.5%, respectively.
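As a rough illustration of how an inverted index over confusion networks and a confidence threshold could interact, the following minimal Python sketch may help. It is not the system described here: the data layout (each utterance as a list of slots holding `(word, posterior)` pairs), the function names, the toy tokens, and the threshold values are all hypothetical. The false alarm rate is assumed to be the fraction of returned hits that are wrong, which may differ from the definition used in the experiments.

```python
from collections import defaultdict


def build_index(networks):
    """Map each word to the (utterance id, slot position, posterior) entries where it occurs.

    `networks` is assumed to look like {utt_id: [[(word, posterior), ...], ...]},
    i.e. one list of competing word hypotheses per confusion-network slot.
    """
    index = defaultdict(list)
    for utt_id, slots in networks.items():
        for pos, slot in enumerate(slots):
            for word, posterior in slot:
                index[word].append((utt_id, pos, posterior))
    return index


def search(index, keyword, threshold):
    """Return hits for `keyword` whose posterior is at least `threshold`, best first."""
    hits = [h for h in index.get(keyword, []) if h[2] >= threshold]
    return sorted(hits, key=lambda h: h[2], reverse=True)


def recall_and_false_alarm(hits, reference):
    """Compute recall and (assumed definition) false alarms divided by returned hits."""
    found = {(utt_id, pos) for utt_id, pos, _ in hits}
    correct = found & reference
    recall = len(correct) / len(reference) if reference else 0.0
    false_alarm = (len(found) - len(correct)) / len(found) if found else 0.0
    return recall, false_alarm


# Toy confusion networks with placeholder tokens (not real recognizer output).
networks = {
    "utt1": [[("alpha", 0.9), ("alpa", 0.1)], [("beta", 0.6), ("alpha", 0.4)]],
    "utt2": [[("gamma", 0.7), ("alpha", 0.3)]],
}
# Hypothetical ground truth: where "alpha" was actually spoken.
reference = {("utt1", 0), ("utt2", 0)}

index = build_index(networks)
for threshold in (0.5, 0.2):
    hits = search(index, "alpha", threshold)
    print(threshold, recall_and_false_alarm(hits, reference))
```

In this toy data, lowering the threshold from 0.5 to 0.2 raises recall from 0.5 to 1.0 but also raises the false alarm rate from 0 to one third. That is the same kind of trade-off as the two reported operating points, where the higher recall comes with the higher false alarm rate.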