A rich product concept hierarchy for the e-commerce domain helps build a comprehensive picture of product attributes and supports in-depth mining of product information, which in turn can be used to mine consumer demand and inform business decisions. Traditional manual construction is inefficient and costly. Today, the vast volume of e-commerce reviews contains a large amount of product attribute information that can be used to build such a hierarchy. This paper therefore takes e-commerce reviews as its data source and uses Conditional Random Fields (CRF) to extract candidate product terms; it then combines deep learning with clustering to generate the product concept hierarchy. The method is efficient, easy to update dynamically, and fairly general. Experimental results show that the precision, recall, and F1 score of product term extraction are 90.17%, 70.87%, and 79.47%, respectively, and that the generated two-level concept hierarchy contains 87 concepts in total. Compared with existing concept hierarchies, the proposed hierarchy has clear levels and is easy to understand. Because it is built directly from product review data, the terms it captures are ones that receive high attention, which brings it closer to the practical needs of product review mining.
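For readers less familiar with the evaluation measures quoted for term extraction, the following is a minimal sketch of how precision, recall, and F1 are commonly computed over sets of extracted terms. The function names and the toy term lists are hypothetical and are not taken from the paper; the abstract does not state how the scores were aggregated, so exact set matching is assumed here.

```python
def term_extraction_scores(extracted, gold):
    """Set-based precision, recall, and F1 for extracted terms.

    Assumption: a term counts as correct only on an exact match
    with a gold-standard term.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, for illustration only.
gold_terms = ["battery", "screen", "price", "camera"]
extracted_terms = ["battery", "screen", "delivery"]

p, r, f1 = term_extraction_scores(extracted_terms, gold_terms)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Applied to the reported precision and recall, `f1_score(0.9017, 0.7087)` gives roughly 0.7936, close to the reported 79.47%.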