Donor and acceptor splice sites of alternative and constitutive introns in the human genome are predicted using the diversity measure combined with quadratic discriminant analysis (IDQD). The predictors are the conservation of nucleotides at the splice sites, the base composition and base correlation of the adjacent sequence segments, the distance between a pair of alternative splice sites, and the GC and TC content of the 30 bases upstream of the acceptor splice site. For alternative splice sites, with the threshold ξ = -2, the total prediction accuracies are 87.9% for donors and 89.9% for acceptors. For constitutive splice sites, with the threshold ξ = -1, the total prediction accuracies are 92.8% for donors and 94.3% for acceptors.
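As a rough illustration of two of the inputs named above, the following Python sketch computes the GC and TC fractions of the window upstream of an acceptor site and a diversity-based score between two base-count profiles. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the diversity measure is assumed to take the commonly used form D = N log N - Σ nᵢ log nᵢ with the increment ID(X, Y) = D(X + Y) - D(X) - D(Y), and the quadratic discriminant step and the threshold ξ are not shown.

```python
import math
from collections import Counter


def base_fraction(seq, bases):
    """Fraction of positions in seq whose base is in `bases` (e.g. "GC" or "TC")."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for b in seq if b in bases) / len(seq)


def upstream_window(sequence, acceptor_pos, width=30):
    """Return up to `width` bases immediately upstream of a 0-based acceptor position."""
    start = max(0, acceptor_pos - width)
    return sequence[start:acceptor_pos]


def diversity(counts):
    """Assumed diversity measure: D = N log N - sum(n_i log n_i)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )


def increment_of_diversity(counts_x, counts_y):
    """Assumed increment of diversity: ID(X, Y) = D(X + Y) - D(X) - D(Y)."""
    merged = Counter(counts_x) + Counter(counts_y)
    return diversity(merged) - diversity(counts_x) - diversity(counts_y)


if __name__ == "__main__":
    # Toy sequence: 10 A's, a 30-base pyrimidine-rich tract, then the AG acceptor dinucleotide.
    toy = "A" * 10 + "CT" * 15 + "AG"
    window = upstream_window(toy, acceptor_pos=40)
    print("GC fraction:", base_fraction(window, "GC"))
    print("TC fraction:", base_fraction(window, "TC"))
    print("ID:", increment_of_diversity(Counter(window), Counter("ACGT" * 5)))
```

Under this assumed definition, two profiles with identical composition give an increment of zero, and the value grows as the compositions diverge, which is what makes it usable as a similarity feature for a discriminant function.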