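An unsupervised query-by-example spoken term detection method based on acoustic segment models is proposed. The method first uses a Gaussian mixture model (GMM) to convert the spectral parameters of the training data into posterior probability feature vectors, and applies a hierarchical clustering algorithm to determine the boundaries in the posterior features, yielding acoustic segments. The segments are then clustered and labeled with the k-means algorithm, and acoustic segment models based on the posterior probabilities are built. At retrieval time, the sequences obtained by decoding the query example and the search documents with the models replace the measurement matrix, which reduces retrieval time. Query terms are located by dynamic matching based on minimum edit distance, whose cost function is modified by a distance matrix of model similarities. Experimental results show that, compared with GMM and conventional acoustic segment models, the proposed method performs better and retrieval speed is significantly improved.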
A study of acoustic segment models (ASMs) for unsupervised query-by-example spoken term detection is presented. First, a Gaussian mixture model (GMM) is trained without any transcription information to label speech frames with Gaussian posteriorgrams. Hierarchical agglomerative clustering is used to decompose the posterior features into acoustically homogeneous segments. A label is assigned to each resulting segment by k-means clustering, and the posteriorgrams are then used to train the ASMs. In the query matching phase, Viterbi decoding is used to represent the query and test posteriorgrams as ASM sequences. Dynamic match lattice spotting based on minimum edit distance is used to locate possible occurrences of the query term. Experimental results show that the proposed method outperforms conventional GMM and ASM tokenizers.
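The matching step can be illustrated with a minimal sketch of minimum-edit-distance spotting over label sequences, in which the substitution cost is read from a similarity-based distance matrix. This is an illustration under stated assumptions, not the authors' implementation: the function name `spot_query`, the integer label encoding, the fixed insertion and deletion costs, and the example matrix are all hypothetical, and a single best path is searched rather than a full lattice.

```python
from typing import Sequence, Tuple


def spot_query(
    query: Sequence[int],
    doc: Sequence[int],
    sub_cost: Sequence[Sequence[float]],
    ins_cost: float = 1.0,  # assumed cost of an extra label in the document
    del_cost: float = 1.0,  # assumed cost of a query label missing from the document
) -> Tuple[float, int]:
    """Find the best approximate occurrence of `query` inside `doc`.

    Both sequences hold integer segment-model labels. sub_cost[a][b] is the
    cost of substituting label a by label b (hypothetically derived from the
    distance between models a and b). Returns (cost, end), where `end` is the
    exclusive document index at which the best match ends.
    """
    n = len(doc)
    # Row for the empty query prefix: a match may start anywhere at zero cost.
    prev = [0.0] * (n + 1)
    for i in range(1, len(query) + 1):
        # Column 0: the whole query prefix is deleted.
        cur = [prev[0] + del_cost] + [0.0] * n
        for j in range(1, n + 1):
            cur[j] = min(
                prev[j - 1] + sub_cost[query[i - 1]][doc[j - 1]],  # substitute/match
                prev[j] + del_cost,                                # delete query label
                cur[j - 1] + ins_cost,                             # insert doc label
            )
        prev = cur
    # The match may end anywhere in the document: take the cheapest end point.
    best_end = min(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end


# Hypothetical distance matrix for three segment labels: labels 0 and 1 are
# acoustically close, label 2 is far from both.
SUB_COST = [
    [0.0, 0.2, 1.0],
    [0.2, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

if __name__ == "__main__":
    cost, end = spot_query([0, 2], [2, 2, 1, 2, 0], SUB_COST)
    print(cost, end)
```

Because the search runs over short label sequences instead of frame-level features, the dynamic-programming table has one cell per pair of labels, which is consistent with the reduced retrieval time the abstract reports. A candidate would typically be accepted when the returned cost, possibly normalized by query length, falls below a threshold; that decision rule is likewise an assumption here.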