Traditional Chinese medicine (TCM) formulas act through multiple components on multiple targets and are assessed by multiple efficacy indicators, so TCM data typically involve many independent variables, many dependent variables, and nonlinear relationships. Partial least squares (PLS) regression is inherently linear and therefore adapts poorly to this nonlinearity. A model tree (MT), by contrast, is composed of multiple multivariate linear segments and fits nonlinear data well. On this basis, this paper proposes a PLS method that incorporates model trees. The principal components in the PLS outer model are still extracted and accumulated in the usual way, t = (t1, t2, t3, …), and after each extraction a model tree is built between the accumulated components and the original response variables, until the precision requirement is met. Experiments on the anti-asthmatic and antitussive data for the monarch drug of Maxing Shigan decoction and on five data sets from the UCI Machine Learning Repository show that PLS combined with model trees adapts well to TCM data.
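The extract-then-fit loop described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names (`pls_scores`, `fit_model_tree`, `pls_model_tree`) are hypothetical, the model tree is a simplified stand-in (recursive quantile splits with least-squares linear leaves, not a full M5-style tree with pruning and smoothing), the stopping rule is taken to be a relative training RMSE threshold `tol`, and the data are synthetic.

```python
import numpy as np

def pls_scores(X, y, n_components):
    """Extract PLS1 score vectors t1..tk by NIPALS-style deflation."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    scores = []
    for _ in range(n_components):
        w = X.T @ y                      # weight: direction of max covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                 # nothing left to explain
            break
        w = w / norm
        t = X @ w                        # score vector (one "principal component")
        tt = t @ t
        X = X - np.outer(t, X.T @ t / tt)  # deflate X
        y = y - t * (t @ y) / tt           # deflate y
        scores.append(t)
    return np.column_stack(scores)

def fit_linear(T, y):
    """Least-squares linear model with intercept."""
    A = np.column_stack([np.ones(len(T)), T])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, T):
    return np.column_stack([np.ones(len(T)), T]) @ coef

def sse_linear(T, y):
    return np.sum((y - predict_linear(fit_linear(T, y), T)) ** 2)

def fit_model_tree(T, y, depth=2, min_leaf=10):
    """Simplified model tree: split only if it lowers the SSE of the linear fits."""
    best = None
    if depth > 0 and len(y) >= 2 * min_leaf:
        base = sse_linear(T, y)
        for j in range(T.shape[1]):
            for thr in np.quantile(T[:, j], [0.25, 0.5, 0.75]):
                left = T[:, j] <= thr
                if left.sum() < min_leaf or (~left).sum() < min_leaf:
                    continue
                sse = sse_linear(T[left], y[left]) + sse_linear(T[~left], y[~left])
                if sse < base and (best is None or sse < best[0]):
                    best = (sse, j, thr)
    if best is None:
        return {"coef": fit_linear(T, y)}      # leaf: one linear segment
    _, j, thr = best
    left = T[:, j] <= thr
    return {"feature": j, "threshold": thr,
            "left": fit_model_tree(T[left], y[left], depth - 1, min_leaf),
            "right": fit_model_tree(T[~left], y[~left], depth - 1, min_leaf)}

def predict_model_tree(tree, T):
    if "coef" in tree:
        return predict_linear(tree["coef"], T)
    out = np.empty(len(T))
    left = T[:, tree["feature"]] <= tree["threshold"]
    out[left] = predict_model_tree(tree["left"], T[left])
    out[~left] = predict_model_tree(tree["right"], T[~left])
    return out

def pls_model_tree(X, y, max_components=5, tol=0.05):
    """Accumulate PLS scores one at a time; refit the tree until the
    relative training RMSE (an assumed precision criterion) drops to tol."""
    T_all = pls_scores(X, y, max_components)
    for k in range(1, T_all.shape[1] + 1):
        T = T_all[:, :k]
        tree = fit_model_tree(T, y)
        resid = y - predict_model_tree(tree, T)
        rel_rmse = np.sqrt(np.mean(resid ** 2)) / np.std(y)
        if rel_rmse <= tol:
            break
    return tree, k, rel_rmse

# Synthetic piecewise-linear data (illustrative only, not TCM measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
s = X[:, 0] + 0.5 * X[:, 1]
y = np.where(s > 0, 2.0 * s, 0.2 * s) + 0.05 * rng.normal(size=200)
tree, k, err = pls_model_tree(X, y)
print(f"components used: {k}, relative training RMSE: {err:.3f}")
```

Because a split is accepted only when it lowers the squared error, the tree in this sketch never fits the training data worse than a single linear regression on the same scores. A training-error threshold is used here only for brevity; a held-out or cross-validated error would be a more robust precision criterion.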