For a given segmentation algorithm, the dictionary mechanism determines how fast words can be looked up, and thereby affects both segmentation speed and how widely the segmentation system can be applied. Based on the differing frequencies with which words occur in text, the dictionary is constructed as a nearly optimal search tree, which reduces the number of comparisons made during segmentation and so increases segmentation speed. Finally, comparison experiments using the reverse maximum matching segmentation algorithm show a modest improvement in segmentation efficiency.
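The idea can be illustrated with a minimal sketch. The abstract does not give the construction details, so the code below assumes the common textbook heuristic for a nearly optimal search tree: among the sorted words, pick as root the word that best balances the total frequency of its left and right subtrees, then recurse. The names `Node`, `build_tree`, `lookup`, `reverse_max_match`, the `max_len` parameter, and the sample frequencies are all hypothetical and chosen only for illustration.

```python
from itertools import accumulate


class Node:
    def __init__(self, word, left=None, right=None):
        self.word, self.left, self.right = word, left, right


def build_tree(freqs):
    """Build a nearly optimal search tree from a {word: frequency} mapping."""
    words = sorted(freqs)
    # prefix[i] is the total frequency of words[0:i]
    prefix = [0] + list(accumulate(freqs[w] for w in words))

    def build(lo, hi):  # builds a subtree over words[lo:hi]
        if lo >= hi:
            return None
        # Choose the root whose left and right frequency sums differ least.
        root = min(
            range(lo, hi),
            key=lambda i: abs((prefix[hi] - prefix[i + 1]) - (prefix[i] - prefix[lo])),
        )
        return Node(words[root], build(lo, root), build(root + 1, hi))

    return build(0, len(words))


def lookup(node, word):
    """Return (found, number of node comparisons)."""
    comparisons = 0
    while node is not None:
        comparisons += 1
        if word == node.word:
            return True, comparisons
        node = node.left if word < node.word else node.right
    return False, comparisons


def reverse_max_match(text, tree, max_len=4):
    """Segment text from right to left, preferring the longest dictionary word."""
    tokens = []
    end = len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted as a fallback token.
            if size == 1 or lookup(tree, piece)[0]:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()
    return tokens


# Hypothetical word frequencies, for illustration only.
freqs = {"研究": 50, "研究生": 10, "生命": 30, "起源": 20, "命": 5}
tree = build_tree(freqs)
print(reverse_max_match("研究生命起源", tree))  # ['研究', '生命', '起源']
```

Because high-frequency words end up near the root, they are found after fewer comparisons than in a frequency-blind balanced tree, which is the effect the abstract attributes to the dictionary mechanism. The recursive construction is kept simple for readability; a production dictionary would likely need an iterative build and a more compact node layout.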