A fast query-by-example spoken term detection (QbE STD) method that combines lower-bound estimation (LBE) with segmental dynamic time warping (SDTW) is presented. The approach is designed for low-resource languages, for which limited or no in-domain training material is available. First, phone posterior probabilities are extracted for the query example and the test set. Next, candidate segments are selected in the test utterances according to a set of constraints, a lower-bound estimate of the actual DTW score is computed between the query example and each candidate segment, and a K-nearest-neighbor (KNN) search finds the segments most similar to the query. Finally, the retrieval results are refined by pseudo relevance feedback (PRF). Experimental results show that, although retrieval precision is slightly lower than that of applying DTW directly, the method retrieves faster, and refining the results with PRF effectively improves retrieval precision.
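The lower-bound and KNN steps described above can be illustrated with a minimal sketch. It is not the paper's implementation, and several choices are assumptions made for illustration: candidate segments are sliding windows of the same length as the query, frames are compared with a negative-log inner-product distance on posterior vectors, DTW is restricted to a band of half-width `r`, and the lower bound uses an element-wise upper envelope of the segment frames inside that band. All function names (`frame_dist`, `dtw`, `lower_bound`, `knn_search`) are hypothetical. The pruning shown here returns the same scores as an exhaustive DTW search, whereas the method in the abstract reports a slight loss of precision, so its actual search procedure may differ. The PRF step is not covered.

```python
import heapq
import math

EPS = 1e-10  # floor on the inner product, avoids log(0)


def frame_dist(p, q):
    """Negative-log inner product of two non-negative vectors, clamped to >= 0."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(min(1.0, max(dot, EPS)))


def dtw(query, seg, r):
    """Accumulated DTW cost of two equal-length sequences within |i - j| <= r."""
    n = len(query)
    inf = float("inf")
    prev = [inf] * n
    for i in range(n):
        cur = [inf] * n
        for j in range(max(0, i - r), min(n, i + r + 1)):
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    prev[j],                        # step from (i-1, j)
                    cur[j - 1] if j > 0 else inf,   # step from (i, j-1)
                    prev[j - 1] if j > 0 else inf,  # step from (i-1, j-1)
                )
            cur[j] = frame_dist(query[i], seg[j]) + best
        prev = cur
    return prev[n - 1]


def lower_bound(query, seg, r):
    """Envelope-based lower bound on dtw(query, seg, r).

    Every warping path visits each query frame i at least once inside the
    band, and the inner product with any segment frame in the band cannot
    exceed the inner product with the element-wise maximum of those frames.
    """
    dim = len(query[0])
    total = 0.0
    for i, q in enumerate(query):
        window = seg[max(0, i - r): i + r + 1]
        upper = [max(f[k] for f in window) for k in range(dim)]
        total += frame_dist(q, upper)
    return total


def knn_search(query, utterance, k, r, step=1):
    """Return the k best (score, start) pairs and the number of DTW calls."""
    n = len(query)
    starts = range(0, len(utterance) - n + 1, step)
    # Visit candidates in order of increasing lower bound.
    bounds = sorted(
        (lower_bound(query, utterance[s:s + n], r), s) for s in starts
    )
    heap = []  # max-heap of the k best scores, stored as (-score, start)
    n_dtw = 0
    for lb, s in bounds:
        # No remaining candidate can beat the current k-th best score.
        if len(heap) == k and lb >= -heap[0][0]:
            break
        score = dtw(query, utterance[s:s + n], r)
        n_dtw += 1
        if len(heap) < k:
            heapq.heappush(heap, (-score, s))
        elif score < -heap[0][0]:
            heapq.heapreplace(heap, (-score, s))
    return sorted((-neg, s) for neg, s in heap), n_dtw
```

Calling `knn_search(query, utterance, k, r)` with lists of posterior vectors yields the `k` lowest-cost segments together with a count of how many full DTW computations were needed. The speed-up comes from that count being smaller than the number of candidates whenever the bound is tight enough to trigger the early exit.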