The Internet has become part of everyday life, and e-commerce platforms such as group-buying sites, online stores, and other online consumption services are now among the most popular ways to shop. Almost every e-commerce platform allows and encourages customers to review the products and services they have purchased. These user-generated reviews are highly valuable to both potential consumers and merchants. That value in turn motivates spam reviews, such as advertisements and fake reviews, which are deliberately mixed into genuine reviews to promote products, spread false publicity, or damage the reputation of competing merchants. Against this background, spam review detection studies how to filter out spam reviews so that the value of genuine reviews can be fully exploited. This paper addresses the spam review detection task defined by COAE2015-TASK4. Using the corpus resources provided by the task, we design a semi-supervised spam review classification method based on heuristic rules. Experimental results show that the proposed method detects spam reviews effectively while maintaining high classification accuracy on normal customer reviews.