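Based on the principle of quantitative structure-activity relationships (QSAR), the intrinsic quantitative relationship between the structures of 106 aliphatic compounds and their acute toxicity, LC50 (median lethal concentration), was studied. A genetic algorithm was used to select, from a large pool of structural parameters, the 4 parameters most closely related to LC50 as molecular descriptors, and QSAR prediction models were then built with the support vector machine (SVM) method and the multiple linear regression (MLR) method, respectively. The performance of the models was checked by both internal and external validation. The results show that both models have high stability, predictive ability, and generalization performance. The SVM model gave mean absolute errors of 0.336 and 0.364 for the training-set and prediction-set samples, respectively, which is better than the results obtained with the MLR method.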
This paper studies the quantitative relationship between the acute toxicity (LC50) and the molecular structure of 106 aliphatic compounds, based on the quantitative structure-activity relationship (QSAR) approach. QSAR is a relatively recent chemoinformatics method for predicting properties; it rests on the basic chemical principle that molecular properties are determined by molecular structure, and it seeks the intrinsic quantitative relation between the molecular structures and the properties of organic compounds. Aliphatic compounds are diverse and widely used in daily life. However, a considerable proportion of them have not yet been tested for toxicity. For this reason, we related the property in question to structural parameters in order to develop a corresponding quantitative model, on the premise that QSAR can predict such properties of organic compounds from their molecular structures alone. Using genetic algorithm (GA) variable selection, we chose the 4 descriptors that contribute most to LC50. We then used both multiple linear regression (MLR) and the newer chemoinformatics method, the support vector machine (SVM), to model the quantitative relation between the selected descriptors and LC50. The resulting models were checked by both internal and external validation. The results show that both models are robust and have high predictive ability and generalization performance. The mean absolute errors of the SVM model for the training set and the prediction set are 0.336 and 0.364, respectively, which are better than the results of the MLR model.
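The MLR part of the workflow described above, with the mean absolute error reported separately for a training set and a prediction set, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the coefficients in `true_coef` are invented, and the 85/21 split of the 106 samples is hypothetical, since the abstract does not give the split. The SVM model and the GA descriptor selection are omitted because they would need an external library.

```python
import random

def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over all samples."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk (normal equations)."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_mlr(coef, X):
    """Apply the fitted coefficients (intercept first) to each descriptor row."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

# Synthetic stand-in for 106 compounds with 4 descriptors each (NOT real data).
rng = random.Random(0)
true_coef = [1.5, 0.8, -0.5, 0.3, 0.2]  # hypothetical intercept + 4 slopes
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(106)]
y = [v + rng.gauss(0, 0.3) for v in predict_mlr(true_coef, X)]

# Hypothetical split: 85 training samples, 21 external prediction samples.
X_train, y_train = X[:85], y[:85]
X_test, y_test = X[85:], y[85:]

coef = fit_mlr(X_train, y_train)
mae_train = mean_absolute_error(y_train, predict_mlr(coef, X_train))
mae_test = mean_absolute_error(y_test, predict_mlr(coef, X_test))
print(f"MAE (training set):   {mae_train:.3f}")
print(f"MAE (prediction set): {mae_test:.3f}")
```

Reporting the error on held-out samples alongside the training error is what the external validation mentioned above amounts to: a prediction-set error close to the training-set error suggests the model generalizes rather than overfits.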
Therefore, it can be concluded that our model for testing the quantitative relationship between the acute toxicity and molecular structures of aliphatic compounds is true to the testin