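We have developed XLOGS, an empirical method for predicting the aqueous solubility (logS) of small organic compounds. It is essentially an additive model that uses 83 atom/group types and three correction factors as descriptors. The method can also compute the logS value of an unknown compound from the experimental logS value of a suitable reference molecule. We parameterized the XLOGS model on a training set of 4171 compounds; multiple linear regression gave a correlation coefficient (R²) of 0.82 and a standard deviation (SD) of 0.96 log units. The training set was further divided into two subsets, one containing only liquid compounds and the other only solid compounds. The regression results of XLOGS on these two subsets showed that the former was better than the latter (SD of 0.65 and 0.94 log units, respectively). The difference between log 1/S and logP (the lipid-water partition coefficient) was also used to study the performance of XLOGS on the liquid and solid compound data sets. The results indicate that additive models such as XLOGS are best suited to compounds for which this difference is close to zero. XLOGS was also compared with three other popular logS models (Qikprop, MOE-logS, and ALOGPS) on an independent test set of 132 drug-like compounds. Overall, our results provide guidance for the sound application of additive models to aqueous solubility prediction.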
We have developed a new empirical model, XLOGS, for computing the aqueous solubility (logS) of organic compounds. It is essentially an additive model that employs a total of 83 atom/group types and three correction factors as descriptors. It can also compute the logS value of a query compound by using the known logS value of an appropriate reference molecule as a starting point. XLOGS was calibrated on a training set of 4171 compounds with known logS values. The squared correlation coefficient (R²) and standard deviation (SD) in regression were 0.82 and 0.96 log units, respectively. The entire training set was further split into one subset containing only liquid compounds and another containing only solid compounds. The regression results of XLOGS were clearly better on the former subset (SD = 0.65 vs. 0.94 log units). The difference between log 1/S and logP (the logarithm of the partition coefficient, i.e., the ratio of the concentrations of a compound in n-octanol and water at equilibrium) was used as an indicator to investigate the performance of XLOGS on liquid and solid compounds. Our results suggest that an additive model like XLOGS performs most satisfactorily when this difference is close to zero. Three other logS models, Qikprop, MOE-logS, and ALOGPS, were also compared with XLOGS on an independent test set of 132 drug-like compounds. Taken together, our study provides some general guidance for applying additive models to the computation of aqueous solubility.
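
To make the ideas above concrete, the following Python sketch shows how an additive model, a reference-based estimate, and the log 1/S minus logP indicator could be computed. It is an illustration only: the descriptor names and parameter values in `PARAMS` are hypothetical and are not the published XLOGS parameters, and the reference-based formula (experimental value of the reference plus the difference between the two additive estimates) is our assumption about how a reference molecule serves as a "starting point".

```python
from typing import Dict

# Hypothetical contributions in log units. These are NOT the fitted XLOGS
# parameters; the real model uses 83 atom/group types and 3 correction factors.
PARAMS: Dict[str, float] = {
    "C.sp3": -0.50,           # hypothetical atom type
    "C.ar": -0.35,            # hypothetical atom type
    "O.hydroxyl": 1.10,       # hypothetical atom type
    "corr.intra_hbond": -0.40,  # hypothetical correction factor
}


def additive_logs(counts: Dict[str, int], intercept: float = 0.0) -> float:
    """Additive estimate: intercept + sum(count_i * contribution_i)."""
    return intercept + sum(n * PARAMS[name] for name, n in counts.items())


def reference_based_logs(
    query_counts: Dict[str, int],
    ref_counts: Dict[str, int],
    ref_logs_exp: float,
) -> float:
    """Assumed scheme: start from the experimental logS of the reference
    and add the difference between the additive estimates of query and
    reference, so that shared substructure contributions cancel."""
    return ref_logs_exp + additive_logs(query_counts) - additive_logs(ref_counts)


def log_inv_s_minus_logp(logs: float, logp: float) -> float:
    """Indicator from the text: log(1/S) - logP, where log(1/S) = -logS."""
    return -logs - logp


if __name__ == "__main__":
    query = {"C.sp3": 2, "O.hydroxyl": 1}
    reference = {"C.sp3": 1, "O.hydroxyl": 1}
    print(additive_logs(query))
    print(reference_based_logs(query, reference, ref_logs_exp=1.0))
    print(log_inv_s_minus_logp(logs=-2.0, logp=2.1))
```

In this sketch the reference-based estimate depends only on the descriptors in which the query and the reference differ, which is why a structurally close reference is desirable. An indicator value near zero marks the kind of compound for which the abstract reports additive models to work best.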