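As information exchange grows more frequent, mobile phones are flooded with harassing and spam text messages that seriously disrupt people's daily lives. Addressing spam SMS filtering, this paper studies the construction and implementation of a text classifier based on minimum-risk-decision Bayes. In SMS filtering systems, naive Bayes depends heavily on the distribution of the sample space and is inherently unstable, which increases time complexity; to address this, a method for constructing a spam SMS text classifier based on improved Bayes is proposed. The method combines the minimum-risk decision algorithm with Bayesian theory to train on batches of messages and form the corresponding set model. The key techniques for text classification are described in detail, and the classification algorithm is implemented. Finally, the algorithm is tested; the results show that the classifier based on minimum-risk-decision Bayes is not only simple to train but also highly accurate, resolves the instability of the naive Bayes algorithm, and offers a reference for SMS filtering technology.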
With the frequent exchange of information, harassing and spam messages on mobile phones disturb people's normal lives. For spam filtering technology, this paper studies the construction and implementation of a text classifier based on an optimized Naïve Bayesian algorithm. In a short message filtering system, Naïve Bayes relies too heavily on the distribution of the sample space and is inherently unstable, which increases time complexity; a method for constructing a spam message text classifier based on improved Bayes is therefore proposed. The method uses Bayesian theory and a minimum risk decision algorithm to complete the training on bulk SMS. The key technologies of text classification are described and the text classification algorithm is implemented. The test results show that the new algorithm is easy to train and improves classification accuracy, resolving the instability of the Naïve Bayesian algorithm, and provides a reference for filtering technology.
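The abstract does not give the paper's formulas or parameters, so the following is only a minimal sketch of the general idea of combining a Bayesian text classifier with a minimum-risk decision rule, not the authors' implementation. It assumes messages are already tokenized, uses Laplace-smoothed multinomial naive Bayes for the posteriors, and picks the action with the lowest expected loss. The function names, the toy messages in `SAMPLE`, and the loss values in `LOSS` are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

# Toy training data, invented for illustration: (tokens, label) pairs.
SAMPLE = [
    (["free", "prize", "click"], "spam"),
    (["win", "free", "cash"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]

# Hypothetical loss table, LOSS[action][true_class]:
# blocking a legitimate message is assumed 10x worse than delivering spam.
LOSS = {
    "spam": {"spam": 0.0, "ham": 10.0},
    "ham": {"spam": 1.0, "ham": 0.0},
}

# Zero-one loss: minimum risk then reduces to picking the largest posterior.
ZERO_ONE = {
    "spam": {"spam": 0.0, "ham": 1.0},
    "ham": {"spam": 1.0, "ham": 0.0},
}


def train(messages):
    """Count class and per-class word frequencies from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in messages:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def posteriors(tokens, model):
    """Return P(class | tokens) under Laplace-smoothed multinomial naive Bayes."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        n_words = sum(word_counts[label].values())
        score = math.log(count / total)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


def classify(tokens, model, loss=LOSS):
    """Choose the action whose expected loss (conditional risk) is smallest."""
    post = posteriors(tokens, model)
    risk = {
        action: sum(cost * post.get(cls, 0.0) for cls, cost in costs.items())
        for action, costs in loss.items()
    }
    return min(risk, key=risk.get), risk


model = train(SAMPLE)
decision, risk = classify(["free", "cash"], model)
```

With this toy data, the message `["free", "cash"]` has a spam posterior of 6/7, yet the asymmetric loss table still delivers it as `ham`, because the expected loss of blocking (10/7) exceeds that of delivering (6/7). Passing `loss=ZERO_ONE` recovers the ordinary maximum-posterior decision, which labels the same message `spam`. The loss values therefore act as a tunable threshold on how much evidence is required before a message is filtered.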