Principal component analysis (PCA) was used to preprocess the sample data set, and the resulting principal components were used as input to a support vector machine (SVM). With the aid of uniform design, a mathematical model relating the amino acid composition of chitinases to their optimum pH was established. With the penalty coefficient (regularization constant) C set to 10, epsilon to 0.7, and Gamma to 0.5, the fitted pH values agreed well with the reported optimum pH values of the chitinases, with a mean absolute percentage error (MAPE) of 3.76%. The model also predicted well: the mean absolute error (MAE) of the predictions was 0.42 pH units. The method outperformed a model based on a back-propagation (BP) neural network in both fitting and prediction.
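The two error measures quoted above can be illustrated with a minimal sketch. It assumes the standard definitions of MAPE and MAE; the function names and the pH values are hypothetical and chosen only for illustration, they are not data from this study.

```python
def mean_absolute_error(reported, predicted):
    """MAE: average of |reported - predicted|, in the same units as the data (pH units here)."""
    if len(reported) != len(predicted) or not reported:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(reported, predicted)) / len(reported)


def mean_absolute_percent_error(reported, predicted):
    """MAPE: average of |reported - predicted| / |reported|, expressed in percent."""
    if len(reported) != len(predicted) or not reported:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reported):
        raise ValueError("MAPE is undefined when a reported value is zero")
    ratios = (abs(r - p) / abs(r) for r, p in zip(reported, predicted))
    return 100.0 * sum(ratios) / len(reported)


# Hypothetical optimum pH values, for illustration only (not from the paper).
reported_ph = [4.0, 5.0, 8.0]
predicted_ph = [4.2, 4.5, 8.4]

mae = mean_absolute_error(reported_ph, predicted_ph)
mape = mean_absolute_percent_error(reported_ph, predicted_ph)
print(f"MAE  = {mae:.2f} pH units")
print(f"MAPE = {mape:.2f} %")
```

MAE is reported in pH units and is therefore directly comparable across enzymes, whereas MAPE scales each error by the reported value, so the same absolute deviation counts for more at low pH than at high pH.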