To identify potential miRNAs in the tomato genome, two support vector machine (SVM) models, sly_pre_SVM and sly_SVM, were built from the features of known miRNAs to predict tomato miRNA precursor sequences and mature miRNA sequences, respectively. The encoding of miRNA feature vectors, miRNA feature selection, and parameter optimization were studied. On the tomato test set, sly_pre_SVM, which predicts miRNA precursors, achieved an accuracy, sensitivity, and specificity of 99.69%, 100%, and 99.66%, respectively. On the same test set, sly_SVM, which predicts the mature region on miRNA precursors, achieved 89.79%, 88.89%, and 90%. The models predicted 41 mature tomato miRNA sequences, 14 of which had not been reported previously, laying a foundation for further experimental studies of miRNA biology.
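For readers unfamiliar with the three reported measures, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from the confusion counts of a binary classifier. The function name `classification_metrics` and the example counts are hypothetical illustrations, not taken from the study, and the paper's own evaluation code may differ.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: real miRNA sequences classified correctly/incorrectly.
    tn/fp: non-miRNA sequences classified correctly/incorrectly.
    The function name and argument names are hypothetical.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true positives recovered (undefined without positives).
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    # Specificity: fraction of true negatives recovered (undefined without negatives).
    specificity = tn / (tn + fp) if (tn + fp) > 0 else math.nan
    return accuracy, sensitivity, specificity


# Made-up counts for illustration only; they are not the paper's data.
acc, sens, spec = classification_metrics(tp=8, fp=1, tn=9, fn=2)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

A sensitivity of 100% therefore means that no true precursor in the test set was missed, whereas specificity below 100% means that some non-miRNA sequences were accepted as positives.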