Alternative splicing is an important step in the transmission and expression of genetic information, and it is widespread among the pre-mRNAs of multi-exon genes in eukaryotes. It increases protein diversity and affects important biological processes such as tissue-specific cell differentiation, individual development, and disease onset. Based on the k-mer (k = 1…5) composition of introns, increment of diversity combined with quadratic discriminant analysis was used to distinguish retained introns from constitutive introns. In five-fold cross-validation, the overall accuracy, sensitivity, and specificity were all higher than 70%. This result suggests that intron sequence information may play an important role in regulating the splicing of retained introns.
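The abstract names the feature-extraction step without defining it, so the following is only a minimal sketch of how k-mer counts and an increment-of-diversity score might be computed. It assumes the commonly used definitions D(X) = N ln N − Σ nᵢ ln nᵢ (with N = Σ nᵢ) and ID(X, S) = D(X + S) − D(X) − D(S); the function names and the choice to skip windows containing non-ACGT symbols are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_counts(seq, k):
    """Return overlapping k-mer counts in a fixed ACGT order (4**k entries).

    Windows containing symbols other than A/C/G/T are ignored (an assumption).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


def diversity(counts):
    """Diversity measure D(X) = N ln N - sum(n_i ln n_i), with 0 ln 0 taken as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * log(total) - sum(c * log(c) for c in counts if c > 0)


def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); smaller values mean more similar sources."""
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)
```

Under these assumptions, an intron could be scored for each k from 1 to 5 against a pooled retained-intron profile and a pooled constitutive-intron profile, and the resulting ID values passed to a quadratic discriminant classifier. How the paper actually assembles its feature vector is not stated in the abstract.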