Existing unsupervised word sense disambiguation methods based on expectation maximization (EM) iteration converge slowly and are computationally expensive. To address these problems, an improved method is proposed that selects features by combining mutual information with the Z-test and estimates initial parameter values with a statistical learning algorithm. Experimental results show that the improved method effectively raises the accuracy of Chinese word sense disambiguation and offers good extensibility and practicality.
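The abstract does not give the exact formulas used for feature selection, so the following is only a minimal sketch of one plausible way to combine mutual information with a Z-test. It assumes pointwise mutual information over co-occurrence counts and a binomial Z-score against the independence hypothesis. The function names (`pmi`, `z_score`, `select_features`), the thresholds `min_pmi` and `z_crit`, and the toy counts are all hypothetical and are not taken from the paper.

```python
import math


def pmi(c_xy, c_x, c_y, n):
    """Pointwise mutual information (base 2) from raw counts.

    c_xy: contexts containing both the candidate word and the target word
    c_x, c_y: contexts containing each word on its own
    n: total number of contexts (assumed: 0 < c_x, c_y < n)
    """
    if c_xy == 0:
        return float("-inf")
    return math.log2(c_xy * n / (c_x * c_y))


def z_score(c_xy, c_x, c_y, n):
    """Binomial Z-score of the observed co-occurrence count.

    Under independence, the chance of a joint occurrence in one context
    is p = (c_x / n) * (c_y / n), giving mean n*p and variance n*p*(1-p).
    """
    p = (c_x / n) * (c_y / n)
    expected = n * p
    return (c_xy - expected) / math.sqrt(n * p * (1 - p))


def select_features(counts, c_target, n, min_pmi=1.0, z_crit=1.96):
    """Keep context words that pass both the PMI and the Z-test thresholds.

    counts maps each candidate word to (c_word, c_joint).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    selected = []
    for word, (c_word, c_joint) in counts.items():
        strong = pmi(c_joint, c_word, c_target, n) >= min_pmi
        significant = z_score(c_joint, c_word, c_target, n) >= z_crit
        if strong and significant:
            selected.append(word)
    return sorted(selected)


# Toy example: the target word occurs in 10 of 100 contexts.
toy_counts = {"money": (10, 8), "the": (10, 1)}
features = select_features(toy_counts, c_target=10, n=100)
```

In this toy setting "money" co-occurs with the target far more often than independence predicts and is kept, while "the" co-occurs exactly as often as chance predicts and is dropped. Requiring both tests is a design choice of this sketch: the Z-test filters out high mutual information values that rest on too few observations.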