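With the steady growth in users of Sina Microblog and similar services, microblog sites have become a main platform for people to obtain and create information. The retrieval function of existing microblog platforms returns results only by keyword matching, so the results often fail to meet users' needs. To address this problem, a microblog semantic retrieval method based on the HowNet knowledge base system is proposed. HowNet is used to match the Chinese query topic terms against the words of microblog texts by semantic relatedness, and the microblog texts whose words are highly related to the query terms are returned. Finally, semantic retrieval experiments are carried out on a Sina Microblog dataset. The experimental results show that using HowNet achieves relatively high precision at the semantic level and gives users more satisfactory retrieval results.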
Microblog sites have become a main platform for users to access and create information. Current microblog retrieval depends mainly on keyword matching rather than semantic analysis, so the results often fail to satisfy users' needs. To overcome this problem, we propose a method for microblog semantic retrieval based on HowNet. First, we use HowNet to compute the semantic relevance between the query term and microblog terms on Chinese corpora. Then, the microblog texts containing terms highly relevant to the query are returned. Finally, we conduct experiments on Sina Microblog data. The experimental results show that our method achieves high precision at the semantic level and yields better retrieval results.
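
The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the relevance formula, the way term scores are aggregated per post, or the cutoff, so the small lookup table `TOY_RELATEDNESS` (a stand-in for a HowNet-based measure), the choice of taking the best-matching term per post, the `threshold` value of 0.7, and the assumption that posts are already word-segmented are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical relatedness scores standing in for a HowNet-based measure.
# The values are invented for illustration only.
TOY_RELATEDNESS: Dict[FrozenSet[str], float] = {
    frozenset({"电脑", "计算机"}): 0.95,
    frozenset({"电脑", "笔记本"}): 0.80,
    frozenset({"电脑", "天气"}): 0.05,
}


def toy_relatedness(a: str, b: str) -> float:
    """Return a relatedness score in [0, 1]; unknown pairs score 0."""
    if a == b:
        return 1.0
    return TOY_RELATEDNESS.get(frozenset({a, b}), 0.0)


def semantic_search(
    query: str,
    posts: List[List[str]],
    relatedness: Callable[[str, str], float] = toy_relatedness,
    threshold: float = 0.7,  # assumed cutoff, not taken from the paper
) -> List[Tuple[int, float]]:
    """Return (post index, score) pairs for posts related to the query.

    Each post is a list of already-segmented words. A post's score is the
    highest relatedness between the query and any of its words (an assumed
    aggregation rule). Results are sorted by score, highest first.
    """
    scored: List[Tuple[int, float]] = []
    for idx, tokens in enumerate(posts):
        best = max((relatedness(query, t) for t in tokens), default=0.0)
        if best >= threshold:
            scored.append((idx, best))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

With the query 电脑, a post containing 计算机 or 笔记本 is returned even though the query string itself never appears in it, which is the behavior that plain keyword matching cannot provide. In practice the `relatedness` argument would be replaced by a function backed by HowNet.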