To address the automatic classification of Vietnamese news text, this paper proposes a classification method based on the support vector machine (SVM). The method uses the SVM learning algorithm and takes full account of the special role that named entities play in classifying Vietnamese news text: words, parts of speech, and named entities are selected as features at the syntactic and semantic levels, and a news text classification model is built from them. Experiments on Vietnamese news text classification show that the proposed method achieves good results and that named-entity features provide strong support for classification.
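The feature selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `extract_features` and `vectorize`, the tag labels (`Np`, `LOC`, `O`), and the sample sentence are all hypothetical, and the tokens are assumed to be already segmented and annotated by external Vietnamese word segmentation, part-of-speech tagging, and named-entity recognition tools. The sketch only shows how word, part-of-speech, and named-entity features might be combined into one vector that an SVM could consume.

```python
from collections import Counter

# Assumption: each token arrives pre-annotated as (word, POS tag, NE tag).
# The tag inventory here is illustrative, not taken from the paper.
def extract_features(tokens):
    """Build a sparse bag of word, POS, and named-entity features."""
    features = Counter()
    for word, pos, ner in tokens:
        features["word=" + word.lower()] += 1
        features["pos=" + pos] += 1
        if ner != "O":  # "O" marks tokens outside any named entity
            features["ne=" + ner] += 1
    return features

def vectorize(feature_counts, vocabulary):
    """Map feature counts onto a fixed-order numeric vector."""
    return [feature_counts.get(name, 0) for name in vocabulary]

# Hypothetical annotated sentence fragment.
doc = [
    ("Hà_Nội", "Np", "LOC"),
    ("tổ_chức", "V", "O"),
    ("hội_nghị", "N", "O"),
    ("kinh_tế", "N", "O"),
]
feats = extract_features(doc)
vocab = sorted(feats)            # in practice, built from the whole training set
vector = vectorize(feats, vocab)
```

Vectors of this kind, paired with category labels, could then be passed to any SVM implementation for training. Keeping the three feature families under separate prefixes makes it straightforward to ablate named-entity features and measure their contribution.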