[Objective] This paper analyzes and processes e-commerce review texts to obtain reliable information on merchant reputation, and builds a merchant reputation dimension system from an objective perspective. [Methods] Drawing on the peer-priority principle (同行优先原理) of HNC theory and on text mining methods, we propose an improved method for extracting topic words from review texts and an improved topic-word clustering algorithm. We then extract a label for each cluster and calculate each cluster's weight. [Results] The method generates a merchant reputation dimension system together with the weight of each dimension. Using mobile phone reviews from the Jingdong platform as a case study, we constructed such a system and evaluated it, which demonstrated the feasibility and effectiveness of the method. [Limitations] Because the HNC lexicon is incomplete, some word symbols had to be generated manually, which may limit the method when it is applied to larger-scale review corpora. [Conclusions] The merchant reputation dimension system built with the proposed method objectively reflects the product attributes that users really care about.
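The last step of the method, deriving a label and a weight for each cluster of topic words, can be illustrated with a minimal sketch. The abstract does not state the labeling rule or the weighting formula, so the choices below are assumptions made only for illustration: the label is taken to be the most frequent topic word in the cluster, and the weight is the cluster's share of all topic-word occurrences. The function name `cluster_weights_and_labels` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cluster_weights_and_labels(clusters, term_counts):
    """Return {cluster_id: (label, weight)} for clustered topic words.

    clusters    : dict mapping a cluster id to a list of topic words
    term_counts : Counter of topic-word frequencies over all reviews

    Assumption (not from the paper): label = most frequent word in the
    cluster; weight = cluster frequency / total frequency of all clusters.
    """
    # Total number of occurrences of each cluster's topic words.
    totals = {
        cid: sum(term_counts[w] for w in words)
        for cid, words in clusters.items()
    }
    grand_total = sum(totals.values())

    result = {}
    for cid, words in clusters.items():
        label = max(words, key=lambda w: term_counts[w])
        weight = totals[cid] / grand_total if grand_total else 0.0
        result[cid] = (label, weight)
    return result


# Hypothetical toy data standing in for topic words mined from phone reviews.
term_counts = Counter(
    {"battery": 30, "charging": 10, "screen": 25, "display": 15, "delivery": 20}
)
clusters = {
    0: ["battery", "charging"],
    1: ["screen", "display"],
    2: ["delivery"],
}
dimensions = cluster_weights_and_labels(clusters, term_counts)
```

Under these assumptions the weights sum to 1, so each weight can be read as the relative attention that reviewers give to the corresponding reputation dimension. The weighting scheme actually used in the paper may differ.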