A Fast Cross-Domain Sentiment Classification Method Based on Feature Selection

ISSN: 1003-5060
Journal: 《合肥工业大学学报：自然科学版》 (Journal of Hefei University of Technology, Natural Science edition)
Date: 0
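Classification: TP181 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]
Author affiliation: [1] School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009
Funding: National High Technology Research and Development Program of China (863 Program) (2012AA011005); National Natural Science Foundation of China (61273292; 61305063); Natural Science Foundation of Anhui Province (1208085QF122)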

Keywords: cross-domain, feature selection, sentiment classification

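Chinese abstract (English translation):

Existing cross-domain sentiment classification methods mostly reduce the differences between domains by extracting a common feature space or by establishing mappings between domain-specific features. Because they ignore differences in the sentiment-discriminative power of features, the resulting common feature space and feature mappings are often inaccurate. Features with high discriminative power are important for text sentiment classification, but the lack of labels makes existing feature selection methods hard to apply. Building on feature selection, this paper proposes a fast cross-domain sentiment classification method (cross-domain sentiment classification based on feature selection, CSFS). The method constructs a word co-occurrence matrix between source-domain features and target-domain features, uses this matrix to evaluate the sentiment-discriminative power of the target-domain features, and selects the target-domain features with high discriminative power. It then uses source-domain information to compute the sentiment semantic magnitude of the target-domain features and builds a target-domain classifier from them. Experimental results show that the method greatly improves the efficiency of cross-domain classification while maintaining accuracy.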

English abstract:

Many existing cross-domain sentiment classification methods reduce the distribution difference between domains by extracting a common subspace or by establishing a mapping between domain-specific features, but they do not consider differences in the sentiment orientation of features. Features with weak sentiment orientation degrade the resulting subspace and mapping, whereas features with strong sentiment orientation are important for sentiment classification. However, existing feature selection methods are difficult to apply to unlabeled data. In this paper, a fast cross-domain sentiment classification method based on feature selection (CSFS) is proposed. First, the word co-occurrence matrix between the source features and the target features is constructed, the sentiment orientation of the target-domain features is evaluated, and the words with higher sentiment orientation are selected as the feature space of the target domain. Second, the features in the target domain are labeled using the source features, and a classifier is built from the labeled features. The empirical results show that CSFS greatly improves the time efficiency of cross-domain classification while maintaining classification accuracy.
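
The abstract gives only the outline of CSFS, so the Python sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes that source-word polarity is a smoothed log-odds over labeled source documents, that co-occurrence is counted at document level in unlabeled target documents, that a target feature's score is the co-occurrence-weighted mean of source polarities, and that the final classifier simply sums the selected feature scores. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log

def source_polarity(labeled_docs):
    """Assumed scoring: smoothed log-odds per source word; labels are +1 / -1."""
    pos, neg = Counter(), Counter()
    for tokens, label in labeled_docs:
        (pos if label > 0 else neg).update(set(tokens))
    return {w: log((pos[w] + 1) / (neg[w] + 1)) for w in set(pos) | set(neg)}

def cooccurrence(target_docs, source_vocab):
    """Document-level co-occurrence counts: target word -> Counter of source words."""
    counts = defaultdict(Counter)
    for tokens in target_docs:
        present = set(tokens)
        anchors = present & source_vocab  # source words seen in this target document
        for w in present:
            counts[w].update(anchors - {w})  # ignore a word co-occurring with itself
    return counts

def score_target_features(counts, polarity):
    """Assumed estimate: co-occurrence-weighted mean of source polarities."""
    scores = {}
    for w, row in counts.items():
        total = sum(row.values())
        if total:  # skip words that never co-occur with a source word
            scores[w] = sum(c * polarity[s] for s, c in row.items()) / total
    return scores

def select_features(scores, k):
    """Keep the k features with the largest absolute score (ties broken by name)."""
    ranked = sorted(scores, key=lambda w: (-abs(scores[w]), w))
    return {w: scores[w] for w in ranked[:k]}

def classify(tokens, weights):
    """Sum the selected feature scores; a non-negative total is labeled +1."""
    total = sum(weights.get(t, 0.0) for t in tokens)
    return 1 if total >= 0 else -1

# Hypothetical toy data: labeled source reviews, unlabeled target reviews.
source_docs = [
    (["great", "plot", "excellent"], 1),
    (["boring", "plot", "poor"], -1),
    (["excellent", "story"], 1),
    (["poor", "story", "boring"], -1),
]
target_docs = [
    ["excellent", "battery", "durable"],
    ["great", "durable", "screen"],
    ["poor", "battery", "flimsy"],
    ["poor", "flimsy", "screen"],
]

polarity = source_polarity(source_docs)
counts = cooccurrence(target_docs, set(polarity))
weights = select_features(score_target_features(counts, polarity), k=2)
print(sorted(weights))
print(classify(["durable", "screen"], weights), classify(["flimsy", "battery"], weights))
```

In this toy setting the selection step keeps the target-specific words whose co-occurring source words agree in polarity and drops words such as "battery" that co-occur equally with positive and negative source words. The scoring and classifier used in the paper may differ.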
