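This paper studies the indexing of Chinese spontaneous conversational speech. We compare the recognition and retrieval performance of lattices built from different units, and propose methods for converting lattices between units, for fusing multiple lattices, and for merging nodes and arcs within a lattice. Lattice conversion decouples the recognition unit from the indexing unit: toneless-syllable lattices converted from word lattices raise the figure of merit (FOM) from 69.2% for the baseline system to 73.7%. Inter-lattice fusion, which combines information from multiple lattices, further raises the FOM to 78.6%. Intra-lattice merging compresses the lattice effectively, making it applicable to large-scale speech retrieval.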
We examine spoken term detection in Chinese spontaneous speech using a lattice-based approach. We compare lattices generated with different units and lattices converted from one unit to another. The best single system uses toneless-syllable lattices converted from word lattices, which raise the figure of merit (FOM) from the baseline 69.2% to 73.7%. By combining lattices from multiple systems into a single lattice and exploiting the redundant information in the combined lattice through time-based node/arc merging, we obtain a compact lattice index with accuracy improved to 79.2%.
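To make the idea of time-based node/arc merging concrete, the following is a minimal sketch, not the procedure used in this work. It assumes a lattice given as a mapping from node ids to timestamps plus a list of arcs carrying a label and a posterior. The function name `merge_lattice_by_time`, the tolerance `tol`, the greedy grouping rule, and the choice to keep the maximum posterior among duplicate arcs are all illustrative assumptions.

```python
from collections import defaultdict


def merge_lattice_by_time(nodes, arcs, tol=0.03):
    """Merge lattice nodes that lie close together in time (illustrative only).

    nodes: dict mapping node id -> time in seconds
    arcs:  list of (src, dst, label, posterior) tuples
    tol:   hypothetical tolerance in seconds; nodes within `tol` of the
           first node of a group are collapsed into that group
    """
    # Greedily group nodes in time order.
    ordered = sorted(nodes.items(), key=lambda kv: kv[1])
    mapping = {}        # original node id -> merged node index
    merged_times = []   # time of each merged node (first member's time)
    group_start = None
    for node_id, t in ordered:
        if group_start is None or t - group_start > tol:
            merged_times.append(t)
            group_start = t
        mapping[node_id] = len(merged_times) - 1

    # Re-route arcs onto merged nodes. Arcs that now share the same
    # (src, dst, label) are collapsed; keeping the maximum posterior is an
    # assumption made here so that scores stay within [0, 1].
    scores = defaultdict(float)
    for src, dst, label, post in arcs:
        s, d = mapping[src], mapping[dst]
        if s == d:
            continue  # arc shorter than the tolerance collapsed; drop it
        scores[(s, d, label)] = max(scores[(s, d, label)], post)

    merged_arcs = [(s, d, label, p) for (s, d, label), p in sorted(scores.items())]
    return merged_times, merged_arcs
```

In this sketch, two competing paths whose intermediate nodes differ by only a few frames end up sharing one node, and their identically labeled arcs collapse into a single arc, which is the source of the compression.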