Feature selection is a key technique in automatic text classification. Targeting the characteristics of information filtering, this paper improves on existing feature selection methods and proposes a text feature selection method based on a semantic neural network. The original feature set is first screened to remove redundant features and noise; the resulting feature subset is then passed to a semantic neural network for intelligent feature selection, the core of which is the computation of association degrees and activation variables. This yields an optimal feature subset that represents the problem space, reducing dimensionality and improving classification accuracy. Experiments show that the method greatly reduces the dimensionality of the text representation and improves the quality of text filtering.