To overcome the limitations of traditional keyword extraction algorithms, which rely on literal matching and lack semantic understanding, this paper presents a Semantic-based Keyword Extraction (SKE) algorithm for Chinese text. The algorithm incorporates the semantic features of words into the extraction process: it constructs a word semantic similarity network and uses betweenness centrality density to measure the semantic importance of each word. Experimental results show that, compared with keyword extraction algorithms based on statistical features, the keywords extracted by SKE better reflect the document's topic and agree more closely with human judgment, and that the algorithm performs comparatively well.
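The network-based scoring idea can be illustrated with a minimal sketch. It is not the SKE algorithm itself: the abstract does not define the similarity measure or the exact "betweenness centrality density" formula, so the code below assumes a hypothetical hand-written similarity table (`TOY_SIMILARITY`), an assumed edge threshold of 0.5, and plain unweighted betweenness centrality (computed with Brandes' algorithm) as a stand-in score. All function names are illustrative.

```python
from collections import deque
from itertools import combinations

# Hypothetical pairwise similarity scores; a real system would derive these
# from a semantic resource or model, which the abstract does not specify.
TOY_SIMILARITY = {
    frozenset(("keyword", "semantic")): 0.8,
    frozenset(("keyword", "text")): 0.7,
    frozenset(("keyword", "network")): 0.6,
    frozenset(("network", "graph")): 0.9,
}
WORDS = ["semantic", "keyword", "network", "graph", "text"]


def toy_similarity(a, b):
    """Look up the assumed similarity of two words (0.0 if not listed)."""
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def build_similarity_graph(words, similarity, threshold=0.5):
    """Connect two words when their similarity reaches the (assumed) threshold."""
    graph = {w: set() for w in words}
    for a, b in combinations(words, 2):
        if similarity(a, b) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness_centrality(graph):
    """Unnormalized betweenness on an undirected, unweighted graph (Brandes)."""
    scores = {v: 0.0 for v in graph}
    for source in graph:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # shortest-path predecessors
        sigma = {v: 0 for v in graph}       # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):           # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in scores.items()}


def extract_keywords(words, similarity, threshold=0.5, top_k=3):
    """Rank words by betweenness in the similarity graph; ties break alphabetically."""
    graph = build_similarity_graph(words, similarity, threshold)
    scores = betweenness_centrality(graph)
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

With the toy data, `extract_keywords(WORDS, toy_similarity, top_k=2)` ranks `keyword` first and `network` second, because those two words lie on the most shortest paths between other words. A word that bridges otherwise unrelated groups of terms scores highly even if it is not the most frequent, which is the intuition behind using a graph measure instead of raw frequency counts.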