Existing research on anaphora resolution for Uyghur personal pronouns has ignored anaphoricity determination, that is, deciding whether a pronoun needs to be resolved at all, and has thereby introduced noise. To address this problem, we propose a method for anaphoricity determination of Uyghur personal pronouns based on deep belief networks (DBN). Based on an analysis of the grammatical features and linguistic rules of Uyghur personal pronouns, we compile an anaphoricity determination feature set containing ten features. The method first trains each Restricted Boltzmann Machine (RBM) layer greedily, layer by layer, so that the feature vector is mapped into different feature spaces while as much feature information as possible is retained. A back-propagation (BP) network is then placed at the last layer to classify the feature vectors output by the RBMs, and the whole network is trained in a supervised manner and fine-tuned. Experimental results show that the proposed method identifies the anaphoricity of Uyghur personal pronouns with an accuracy of 95.17%, which is 9% higher than that of the SVM algorithm, demonstrating the effectiveness and feasibility of the method.
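The training procedure described above (greedy layer-wise RBM pretraining followed by supervised fine-tuning with a BP output layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, learning rates, epoch counts, the use of one-step contrastive divergence (CD-1), the single sigmoid output unit, and the synthetic binary data are all assumptions made for illustration. Only the input width of ten echoes the ten-feature set mentioned in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (an assumption)."""
    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b_vis = np.zeros(n_visible)
        self.b_hid = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def fit(self, data, epochs=50, lr=0.1):
        n = len(data)
        for _ in range(epochs):
            h0 = self.hidden_probs(data)                        # positive phase
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h_sample @ self.W.T + self.b_vis)      # reconstruction
            h1 = self.hidden_probs(v1)                          # negative phase
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b_vis += lr * (data - v1).mean(axis=0)
            self.b_hid += lr * (h0 - h1).mean(axis=0)
        return self

def pretrain(X, hidden_sizes, rng):
    """Greedy layer-wise pretraining: each RBM is fit on the output of the layer below."""
    rbms, layer_input = [], X
    for n_hidden in hidden_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng).fit(layer_input)
        rbms.append(rbm)
        layer_input = rbm.hidden_probs(layer_input)
    return rbms

def fine_tune(rbms, X, y, rng, epochs=500, lr=0.5):
    """Stack a sigmoid output unit on the RBMs and backpropagate through every layer."""
    weights = [r.W.copy() for r in rbms]
    biases = [r.b_hid.copy() for r in rbms]
    weights.append(rng.normal(0.0, 0.1, size=(weights[-1].shape[1], 1)))
    biases.append(np.zeros(1))
    target = y.reshape(-1, 1).astype(float)
    for _ in range(epochs):
        acts = [X]
        for W, b in zip(weights, biases):
            acts.append(sigmoid(acts[-1] @ W + b))
        # Gradient of the mean cross-entropy loss w.r.t. the output pre-activation.
        delta = (acts[-1] - target) / len(X)
        for i in reversed(range(len(weights))):
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:  # propagate the error before this layer's weights change
                delta = (delta @ weights[i].T) * acts[i] * (1.0 - acts[i])
            weights[i] -= lr * grad_W
            biases[i] -= lr * grad_b
    return weights, biases

def predict(weights, biases, X):
    a = X
    for W, b in zip(weights, biases):
        a = sigmoid(a @ W + b)
    return (a[:, 0] >= 0.5).astype(int)

# Synthetic stand-in data: 10 binary features, label copied from the first feature.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 2, size=(200, 10)).astype(float)
y_demo = X_demo[:, 0].astype(int)
demo_rbms = pretrain(X_demo, [8, 6], rng)
demo_weights, demo_biases = fine_tune(demo_rbms, X_demo, y_demo, rng)
demo_pred = predict(demo_weights, demo_biases, X_demo)
```

The pretraining step is unsupervised and only initializes the hidden layers; the labels enter during fine-tuning, when the error signal from the output layer adjusts all weights jointly.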