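Building on a study of applying text clustering to the classification of customer service text records in the mobile communications industry, this paper constructs a conceptual model for text classification. Customer knowledge is defined using a set-based representation; texts are transformed and the data matrix is built with the vector space model; a TF-MI function is proposed to compute feature-word weights; hierarchical clustering is used to process the data; and the clustering results are analyzed and discussed against four category-judgment criteria. The study thereby further underscores the practical value of text clustering for knowledge acquisition in customer service systems in the mobile communications industry.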
This paper examines the application of text clustering to customer service records in the mobile communications industry. A conceptual model of knowledge classification based on text clustering is proposed, together with a concrete procedure for carrying it out. Customer knowledge in the mobile communications industry is defined using a set-based representation. The vector space model (VSM) is used for knowledge transformation and matrix construction, and a TF-MI function is proposed to calculate the weights of feature words. Different clustering approaches are compared, and hierarchical clustering is explored. The clustering results are analyzed and verified using the four rules presented by Ma Qingguo.
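As a rough illustration of the pipeline summarized above (vector space representation, term weighting, hierarchical clustering), the following is a minimal sketch and not the authors' implementation. The abstract does not give the TF-MI formula, so plain term frequency is used as a stand-in weight. The function names `build_vsm`, `cosine_distance` and `hierarchical_clusters`, the choice of average linkage with cosine distance, and the sample records are all assumptions made for this example.

```python
import math
from collections import Counter


def build_vsm(docs):
    """Turn tokenized documents into weighted vectors over a shared vocabulary."""
    vocab = sorted({term for doc in docs for term in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Assumption: plain term frequency stands in for the TF-MI weight,
        # whose formula is not stated in the abstract.
        vectors.append([counts[t] / len(doc) for t in vocab])
    return vocab, vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def hierarchical_clusters(vectors, k):
    """Agglomerative clustering (average linkage) until k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [cosine_distance(vectors[i], vectors[j])
                         for i in clusters[a] for j in clusters[b]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        # Merge the closest pair of clusters.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical, already-tokenized customer service records.
records = [
    "bill charge wrong this month".split(),
    "wrong charge on monthly bill".split(),
    "no signal at home".split(),
    "signal weak at home and office".split(),
]
vocab, vectors = build_vsm(records)
clusters = hierarchical_clusters(vectors, 2)
print(clusters)
```

In this toy input the two billing records and the two signal records share no terms across groups, so they end up in separate clusters. A real system would replace the stand-in weight with the TF-MI function and would need word segmentation for Chinese text.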