Accurate recognition of splice sites is an important step in gene structure prediction. Starting from the basic physical principles of the splicing reaction, the maximum information principle is applied to the theoretical analysis of the reaction, and an expression for estimating the reaction free energy is derived. As a simplified, coarse-grained model, this expression can be used to estimate the free energy change in a splicing reaction involving a 5′ or 3′ splice site. It captures dependencies among both adjacent and non-adjacent bases fairly comprehensively, and it also accounts for the influence of genomic background probabilities. The expression was used to predict constitutive and alternative splice sites in human genes; the results are satisfactory, and its predictive ability is comparable to that of some currently popular methods. This indicates that the maximum information principle can serve as a theoretical starting point for studying certain nucleic acid-protein interaction systems, such as the splicing reaction, and that the derived free energy expression agrees well with the splicing reaction process.