To support sentiment orientation analysis of Uyghur web content, this paper studies the construction of a Uyghur sentiment lexicon. First, the sentiment benchmark words reported in existing work are collected and analyzed, and words with high usage frequency and strong sentiment orientation are selected as Uyghur sentiment seed words; a Uyghur electronic synonym dictionary is then used to build an expanded seed word set. Second, the union of the sentiment lexicons from HowNet, NTUSD, and Dalian University of Technology is taken and translated into Uyghur to form the candidate word set. Finally, using a corpus, the pointwise mutual information (PMI) between each candidate word and the seed words and their synonym expansions is computed; the result determines the polarity of the candidate, which is then added to the corresponding positive or negative sentiment word library. Compared with the results of Chinese sentence sentiment orientation evaluation experiments, Uyghur sentence orientation judgment based on this lexicon achieves essentially the same accuracy and recall.
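The final step, judging a candidate word's polarity from its mutual information with the seed words, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes sentence-level co-occurrence counting, a small additive smoothing constant on the joint count, and the common SO-PMI decision rule (sum of PMI with positive seeds minus sum of PMI with negative seeds). The function names, the `smoothing` and `threshold` parameters, and the English placeholder tokens standing in for Uyghur words are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count the sentences containing each word and each unordered word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        unique = sorted(set(tokens))  # sorted so each pair has one canonical key
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts


def pmi(w1, w2, word_counts, pair_counts, n_sentences, smoothing=0.01):
    """PMI(w1, w2) = log2(P(w1, w2) / (P(w1) * P(w2))), estimated per sentence.

    The smoothing constant (an assumption of this sketch) keeps the
    logarithm defined when the two words never co-occur.
    """
    c1, c2 = word_counts[w1], word_counts[w2]
    if c1 == 0 or c2 == 0:
        return 0.0  # no evidence for words absent from the corpus
    joint = c1 if w1 == w2 else pair_counts[tuple(sorted((w1, w2)))]
    return math.log2((joint + smoothing) * n_sentences / (c1 * c2))


def so_pmi(candidate, pos_seeds, neg_seeds, word_counts, pair_counts, n_sentences):
    """Association with positive seeds minus association with negative seeds."""
    pos = sum(pmi(candidate, s, word_counts, pair_counts, n_sentences) for s in pos_seeds)
    neg = sum(pmi(candidate, s, word_counts, pair_counts, n_sentences) for s in neg_seeds)
    return pos - neg


def classify(candidate, pos_seeds, neg_seeds, sentences, threshold=0.0):
    """Label a candidate 'positive', 'negative', or 'neutral' by its SO-PMI score."""
    word_counts, pair_counts = cooccurrence_counts(sentences)
    score = so_pmi(candidate, pos_seeds, neg_seeds, word_counts, pair_counts, len(sentences))
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Toy corpus of tokenized sentences (placeholder tokens, not real data).
corpus = [
    ["good", "excellent", "service"],
    ["good", "excellent", "food"],
    ["bad", "terrible", "service"],
    ["bad", "terrible", "food"],
    ["good", "nice"],
    ["bad", "awful"],
]
for word in ["excellent", "terrible", "service"]:
    print(word, classify(word, ["good"], ["bad"], corpus))
```

In this sketch the seed lists would hold both the seed words and their synonym expansions, so a candidate is compared against the full expanded set. A nonzero `threshold` leaves weakly associated candidates unlabeled instead of forcing them into either word library.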