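Objective To examine multiblock partial least squares regression (MB-PLSR), a statistical method for complex data whose variables fall into multiple blocks, and to apply it to the study of heavy metal transfer from the environment to food. Methods The factors influencing the transfer of cadmium (Cd) from the environment to rice were divided into two categories, soil physicochemical properties and the Cd content in each of its states. MB-PLSR was used to build an environment-to-rice Cd transfer model, and its performance was compared with that of conventional partial least squares regression (PLSR). Results MB-PLSR made good use of the prior information carried by the variable blocks and outperformed PLSR in data fitting, predictive performance, and dimension reduction. Conclusion MB-PLSR is suitable for modeling complex data with variable blocks and offers good information integration and interpretability.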
Objective To explore multiblock partial least squares regression (MB-PLSR), which deals with multiple variable blocks in complex data, and to apply this statistical method to modeling environment-food heavy metal transfer. Methods The factors influencing cadmium (Cd) transfer from the environment to rice were divided into two blocks: a soil physical-chemical variable block and a multi-state Cd variable block. MB-PLSR was used to model environment-food Cd transfer and was compared with classical partial least squares regression (PLSR) in terms of performance. Results In terms of dimension reduction, model prediction, and interpretation, MB-PLSR was superior to PLSR. Conclusion As a practical soft-modeling method for complex data with a multiple variable block structure, MB-PLSR has several technical advantages in information extraction and model interpretability.
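
The block-wise modeling idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one common simplification of multiblock PLS: each block is autoscaled and divided by the square root of its number of variables, so that every block carries equal total variance, and an ordinary single-response PLS (NIPALS) is then fitted to the concatenated blocks. The function names (`block_scale`, `pls1_nipals`, `mbpls_fit`), the block-importance measure, and the synthetic "soil" and "Cd" data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def block_scale(blocks):
    """Autoscale each block, then divide by sqrt(number of columns)
    so that every block contributes the same total variance."""
    scaled = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
        scaled.append(Z / np.sqrt(X.shape[1]))
    return scaled


def pls1_nipals(X, y, n_components):
    """Single-response PLS by NIPALS on centered X and y.
    Returns weights W, X-loadings P and y-loadings q."""
    X = X.copy()
    y = y.copy()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    P = np.zeros((n_vars, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)          # unit-length weight vector
        t = X @ w                       # scores
        tt = t @ t
        P[:, a] = X.T @ t / tt
        q[a] = (y @ t) / tt
        W[:, a] = w
        X -= np.outer(t, P[:, a])       # deflate X
        y -= q[a] * t                   # deflate y
    return W, P, q


def mbpls_fit(blocks, y, n_components=2):
    """Fit PLS on block-scaled, concatenated blocks (an assumed
    simplification of MB-PLSR). Returns fitted values, coefficients
    on the scaled variables, and each block's share of the squared
    first-component weights as a rough block-importance measure."""
    y = np.asarray(y, dtype=float)
    X = np.hstack(block_scale(blocks))
    W, P, q = pls1_nipals(X, y - y.mean(), n_components)
    coef = W @ np.linalg.solve(P.T @ W, q)
    fitted = X @ coef + y.mean()
    edges = np.cumsum([0] + [np.asarray(b).shape[1] for b in blocks])
    importance = np.array([
        np.sum(W[edges[k]:edges[k + 1], 0] ** 2)
        for k in range(len(blocks))
    ])
    return fitted, coef, importance


if __name__ == "__main__":
    # Synthetic stand-ins for the two blocks; NOT the study's data.
    rng = np.random.default_rng(0)
    soil = rng.normal(size=(200, 4))    # hypothetical soil property block
    cd = rng.normal(size=(200, 3))      # hypothetical multi-state Cd block
    rice_cd = soil[:, 0] + 2.0 * cd[:, 1] + 0.1 * rng.normal(size=200)
    fitted, coef, importance = mbpls_fit([soil, cd], rice_cd, n_components=2)
    r2 = 1 - np.sum((rice_cd - fitted) ** 2) / np.sum((rice_cd - rice_cd.mean()) ** 2)
    print("R^2:", r2)
    print("block importance:", importance)
```

Scaling each block by the square root of its size keeps a block with many variables from dominating the latent components, which is one way the prior block structure can enter the model. A classical PLSR baseline for comparison would omit that block-level scaling step.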