To predict the activity of HIV-1 protease inhibitors, 462 constitutional and topological descriptors were calculated to characterize the structural and physicochemical properties of each molecule under study. The training and test sets were designed using the Kennard-Stone method and a random method. Variable selection was performed with a Monte Carlo simulated annealing method. Inhibitor models were then built with four machine learning methods: support vector machine (SVM), artificial neural network, logistic regression, and k-nearest neighbor. The support vector machine outperformed the other methods, and the best model built with the SVM method reached a final prediction accuracy of 98.24%.
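As an illustration of the Kennard-Stone design step named above, the following is a minimal sketch rather than the procedure actually used in this work. It assumes Euclidean distance on the descriptor vectors (the abstract does not state the metric), and the function name `kennard_stone` and the toy data are hypothetical. The rule seeds the training set with the two most distant samples and then repeatedly adds the sample whose nearest selected neighbor is farthest away.

```python
import math


def kennard_stone(X, n_train):
    """Return (train_idx, test_idx) chosen by the Kennard-Stone rule.

    X is a list of equal-length numeric descriptor vectors.
    Euclidean distance is an assumption made for this sketch.
    """
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(X)")

    # Full pairwise distance matrix (O(n^2) memory, acceptable for a sketch).
    dist = [[math.dist(a, b) for b in X] for a in X]

    # Seed the training set with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]

    # min_d[k] is the distance from sample k to its nearest selected sample.
    min_d = [min(dist[k][i0], dist[k][j0]) for k in range(n)]

    while len(selected) < n_train:
        remaining = [k for k in range(n) if k not in selected]
        # Max-min criterion: take the sample farthest from the selected set.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        min_d = [min(min_d[k], dist[k][nxt]) for k in range(n)]

    test = [k for k in range(n) if k not in selected]
    return selected, test


if __name__ == "__main__":
    # Hypothetical one-descriptor toy data, not taken from the paper.
    toy = [[0.0], [1.0], [2.0], [10.0], [5.0]]
    train_idx, test_idx = kennard_stone(toy, n_train=3)
    print(train_idx, test_idx)  # [0, 3, 4] [1, 2]
```

Because the selection depends only on the descriptor space and not on the activity labels, the resulting training set tends to span the descriptor range evenly, which is the usual motivation for comparing it against a random split.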