With the rapid growth of e-commerce in China, the volume of product reviews on major e-commerce websites has increased sharply. Mining valuable information from the large number of product reviews on the Web and classifying those reviews effectively has an important influence on the decisions of both consumers and manufacturers. Traditional classification methods can effectively extract features and opinions from product reviews, but they still have shortcomings when applied to Chinese reviews. To further improve the effectiveness of product review classification, we first build on previous research to propose a feature extraction method based on review length, which improves classification accuracy. We then design a method for automatically annotating review samples and construct a model for classifying review validity, which improves classification efficiency. Finally, we validate the proposed feature extraction and automatic annotation methods on 1710 product reviews crawled from Jingdong Mall. The experimental results show that the proposed method markedly improves review classification accuracy.