Density functional theory (DFT) results carry inherent errors that stem from the various intrinsic approximations of the method. In this paper, the heats of formation, ΔfHΘ(calc), of 220 small to medium-sized organic molecules were calculated at the O3LYP/6-311+G(3df,2p)//O3LYP/6-31G(d) level, and artificial neural network (ANN) and multiple linear regression (MLR) methods were then applied to correct the calculated values. The descriptors used for both ANN and MLR were ΔfHΘ(calc), the zero-point energy, the total number of atoms, the number of hydrogen atoms, and the numbers of 2-center bonds, 2-center antibonds, 1-center valence lone pairs, and 1-center core pairs. The ANN and MLR models were constructed using a training set of 180 molecules and were subsequently used to predict corrected ΔfHΘ(calc) values for an independent test set of 40 molecules. For the training set, the root mean square deviation (RMSD) between the calculated and experimental ΔfHΘ values was reduced from 24.7 kJ·mol⁻¹ to 11.8 kJ·mol⁻¹ after ANN correction and to 13.0 kJ·mol⁻¹ after MLR correction. For the independent test set, the RMSD was reduced from 21.3 kJ·mol⁻¹ to 10.4 kJ·mol⁻¹ (ANN) and 12.1 kJ·mol⁻¹ (MLR). In both sets, ANN therefore gave the lower RMSD, indicating better fitting and predictive ability than MLR.
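
The MLR part of this workflow (fit on 180 molecules, evaluate on 40, compare RMSD before and after correction) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the descriptor values and "experimental" heats of formation are synthetic random numbers rather than the data from this study, the helper names `rmsd`, `fit_mlr`, and `predict_mlr` are hypothetical, and the fit uses ordinary least squares from NumPy, which may differ from the authors' actual procedure. The printed RMSD values will therefore not reproduce those reported in the abstract.

```python
import numpy as np


def rmsd(a, b):
    """Root mean square deviation between two arrays of equal length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


def fit_mlr(X, y):
    """Ordinary least-squares fit of y on X with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted MLR coefficients (intercept first) to descriptors X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# --- Synthetic stand-in data (NOT the data from the paper) ---
rng = np.random.default_rng(0)
n_mol, n_train = 220, 180

dhf_calc = rng.normal(-100.0, 200.0, n_mol)   # "calculated" heat of formation
zpe = rng.uniform(100.0, 600.0, n_mol)        # zero-point energy
# Six count-type descriptors (atoms, H atoms, bonds, antibonds, lone pairs, cores)
counts = rng.integers(1, 30, size=(n_mol, 6)).astype(float)
X = np.column_stack([dhf_calc, zpe, counts])  # 8 descriptors per molecule

# Assumed systematic error that depends on two of the count descriptors, plus noise
dhf_exp = (dhf_calc - 1.2 * counts[:, 0] + 0.8 * counts[:, 1] - 4.0
           + rng.normal(0.0, 3.0, n_mol))

# --- Train on 180 molecules, test on the remaining 40 ---
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = dhf_exp[:n_train], dhf_exp[n_train:]

coef = fit_mlr(X_train, y_train)

rmsd_train_raw = rmsd(X_train[:, 0], y_train)
rmsd_train_corr = rmsd(predict_mlr(coef, X_train), y_train)
rmsd_test_raw = rmsd(X_test[:, 0], y_test)
rmsd_test_corr = rmsd(predict_mlr(coef, X_test), y_test)

print(f"training set RMSD: raw {rmsd_train_raw:.1f}, corrected {rmsd_train_corr:.1f}")
print(f"test set RMSD:     raw {rmsd_test_raw:.1f}, corrected {rmsd_test_corr:.1f}")
```

Because the calculated heat of formation is itself one of the descriptors, the uncorrected values are a special case of the linear model, so the least-squares fit cannot increase the training-set RMSD. The test-set RMSD is the meaningful check of predictive ability. An ANN correction would replace the linear map with a nonlinear one trained on the same descriptors and the same split.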