Support vector machines (SVMs) behave largely as a "black box": users find it difficult to understand how the data are processed. Rules can be extracted effectively with a regression tree algorithm, but the traditional regression tree usually takes the arithmetic mean of the samples in a leaf node as the result of the corresponding rule. The drawback is that when many samples reach a leaf node and their target values vary widely, training and prediction accuracy decline severely, and the accuracy depends heavily on the values chosen for the termination conditions. We therefore propose an improved algorithm that uses the least squares method to fit a function expression at each leaf node of the regression tree, replacing the arithmetic mean used in the original algorithm. Validation on coal-to-methanol data shows that, compared with the traditional regression tree algorithm, the improved algorithm raises training accuracy by 10.6% and prediction accuracy by 16.3%. It also avoids blind trial and error in choosing the termination condition values, which improves experimental efficiency.
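The idea of replacing the arithmetic mean at each leaf with a least-squares fit can be illustrated with a minimal sketch. It is not the authors' implementation: the variance-reduction split criterion, the stopping parameters `min_leaf` and `max_depth`, the linear form of the leaf function, all function names, and the synthetic piecewise-linear data (used in place of the coal-to-methanol data, which is not available here) are assumptions made for illustration only.

```python
import numpy as np

def best_split(X, y, min_leaf):
    """Return (feature, threshold) minimising the summed squared error around
    the two child means, or None if no split keeps min_leaf samples per side."""
    best, best_sse = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            n_left = left.sum()
            if n_left < min_leaf or len(y) - n_left < min_leaf:
                continue
            # variance * count = sum of squared errors around the child mean
            sse = y[left].var() * n_left + y[~left].var() * (len(y) - n_left)
            if sse < best_sse:
                best, best_sse = (j, t), sse
    return best

def fit_leaf(X, y, linear):
    """Leaf model stored as [w_1, ..., w_d, b].
    linear=False gives the traditional leaf: zero slopes, b = mean of y.
    linear=True fits w and b by ordinary least squares."""
    if linear:
        A = np.hstack([X, np.ones((len(X), 1))])  # append intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return coef
    return np.append(np.zeros(X.shape[1]), y.mean())

def build(X, y, linear, min_leaf=10, depth=0, max_depth=3):
    """Grow the tree recursively; min_leaf and max_depth act as the
    termination conditions (hypothetical defaults)."""
    split = best_split(X, y, min_leaf) if depth < max_depth else None
    if split is None:
        return {"coef": fit_leaf(X, y, linear)}
    j, t = split
    left = X[:, j] <= t
    return {
        "feature": j,
        "threshold": t,
        "left": build(X[left], y[left], linear, min_leaf, depth + 1, max_depth),
        "right": build(X[~left], y[~left], linear, min_leaf, depth + 1, max_depth),
    }

def predict_one(node, x):
    """Follow the split rules down to a leaf, then evaluate its function."""
    while "coef" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return x @ node["coef"][:-1] + node["coef"][-1]

def rmse(tree, X, y):
    pred = np.array([predict_one(tree, x) for x in X])
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# Synthetic data: a continuous piecewise-linear target with a kink at x = 5.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(400, 1))
y = np.where(X[:, 0] < 5, 2.0 * X[:, 0], 17.5 - 1.5 * X[:, 0])
y = y + rng.normal(0, 0.2, size=400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

mean_tree = build(X_train, y_train, linear=False)  # traditional leaves
ls_tree = build(X_train, y_train, linear=True)     # least-squares leaves

rmse_mean = rmse(mean_tree, X_test, y_test)
rmse_ls = rmse(ls_tree, X_test, y_test)
print(f"test RMSE, mean leaves:          {rmse_mean:.3f}")
print(f"test RMSE, least-squares leaves: {rmse_ls:.3f}")
```

Because the split criterion does not depend on the leaf type, both trees in this sketch share the same split rules and differ only in what each leaf returns. That makes the comparison isolate the effect of the leaf model, and it also shows why the least-squares variant is less sensitive to the termination settings: a leaf that still contains a wide range of target values can follow their trend instead of collapsing them to a single number.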