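Classification models for the activity of 761 dihydrofolate reductase (DHFR) inhibitors were built using several machine learning methods: support vector machine (SVM), artificial neural network, regularized logistic regression, and K-nearest neighbor. Constitutional and topological descriptors were used to characterize the molecular structures and physicochemical properties of the inhibitors, the Kennard-Stone method was used to design the training set, and Metropolis Monte Carlo simulated annealing was used for variable selection. The results show that SVM outperformed the other machine learning methods, and the best model gave good predictions, with a prediction accuracy of 91.62%. This indicates that, with suitable training set design and variable selection, SVM can be applied well to classifying the activity of DHFR inhibitors.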
Machine learning methods, including Support Vector Machine (SVM), Artificial Neural Network, Regularized Logistic Regression, and K-Nearest Neighbor, are used to develop classification models for a set of 761 dihydrofolate reductase (DHFR) inhibitors. Constitutional and topological descriptors are calculated to characterize the structural and physicochemical properties of the compounds. The Kennard-Stone method is used to design the training set, and Metropolis Monte Carlo simulated annealing is used for feature selection. The SVM method outperforms the other machine learning methods used in this study, and the final SVM model after feature selection gives a prediction accuracy of 91.62%. This suggests that SVM, combined with proper training set design and feature selection, is potentially useful for predicting the activity of a diverse set of DHFR inhibitors.
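As an illustration of the training set design step, the following is a minimal sketch of Kennard-Stone selection. It is not the authors' implementation: the function name `kennard_stone`, the use of Euclidean distance on the descriptor vectors, and the toy data are all assumptions made for this example. The procedure seeds the training set with the two most distant samples, then repeatedly adds the sample whose nearest already-selected neighbor is farthest away, so that the training set spreads over the descriptor space.

```python
import math


def kennard_stone(X, n_train):
    """Return indices of n_train rows of X chosen by the Kennard-Stone rule.

    X is a list of descriptor vectors (lists of floats). Euclidean
    distance is an assumption here; the abstract does not name a metric.
    """
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(X)")

    # Pairwise Euclidean distances between descriptor vectors.
    dist = [[math.dist(a, b) for b in X] for a in X]

    # Seed the selection with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # min_d[k]: distance from sample k to its nearest selected sample.
    min_d = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}

    while len(selected) < n_train:
        # Pick the sample that is farthest from everything selected so far.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_d[nxt]
        # Update nearest-selected distances for the samples still unselected.
        for k in remaining:
            min_d[k] = min(min_d[k], dist[k][nxt])

    return selected


if __name__ == "__main__":
    # Toy one-dimensional "descriptors", purely illustrative.
    toy = [[0.0], [1.0], [2.0], [10.0], [5.0]]
    print(kennard_stone(toy, 3))
```

The returned indices would form the training set, and the unselected samples the test set. This sketch stores the full distance matrix, which is workable for a data set of a few hundred compounds such as the 761 inhibitors described here.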