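This paper proposes a new corpus training algorithm that searches the training corpus on a bigram model using forward and backward bidirectional scanning in order to expand the training corpus, and presents an improved version of the Viterbi algorithm. Comparative experiments on the bigram model use training corpora of different sizes to analyze a test corpus of fixed size. The results show that the algorithm is feasible.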
A primary task in working with a hidden Markov model is parameter estimation, which is usually a prerequisite for part-of-speech tagging with the Viterbi algorithm. In this paper, the authors present a new corpus training algorithm that searches the training data on a bigram model using forward and backward bidirectional scanning in order to expand the training corpus, and they propose an improvement to the Viterbi algorithm. In comparative experiments on the bigram model, training corpora of different sizes are used to tag a test corpus of fixed size. The results show that the algorithm is feasible.
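For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard Viterbi algorithm for part-of-speech tagging with a bigram hidden Markov model, not the improved algorithm or the bidirectional corpus expansion proposed by the authors, whose details are not given here. The function name `viterbi`, the dictionary layout of the parameters, the probability floor for unseen events, and the toy probabilities are all assumptions made for the example.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM.

    start_p[t]    : probability that a sentence starts with tag t
    trans_p[s][t] : probability of tag t following tag s (bigram transition)
    emit_p[t][w]  : probability of tag t emitting word w
    floor         : assumed small probability for unseen events (avoids log(0))
    """
    def lp(p):
        return math.log(max(p, floor))

    if not words:
        return []

    # best[i][t]: log-probability of the best tag path ending in t at position i
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev = max(tags,
                       key=lambda s: best[i - 1][s] + lp(trans_p[s].get(t, 0.0)))
            best[i][t] = (best[i - 1][prev]
                          + lp(trans_p[prev].get(t, 0.0))
                          + lp(emit_p[t].get(words[i], 0.0)))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy parameters, invented purely for illustration.
TAGS = ["N", "V"]
START_P = {"N": 0.8, "V": 0.2}
TRANS_P = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
EMIT_P = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.05, "bark": 0.7}}

print(viterbi(["dogs", "bark"], TAGS, START_P, TRANS_P, EMIT_P))
```

In practice the three parameter tables would be estimated from counts over the tagged training corpus, which is why the size and coverage of that corpus directly affect tagging accuracy.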