The online version of this article (doi:10.1007/s11390-008-9154-7) contains supplementary material, which is available to authorized users.
Stochastic context-free grammars (SCFGs) have been applied to predicting RNA secondary structure, and such prediction can be facilitated by incorporating comparative sequence analysis. However, most existing SCFG-based methods lack an explicit phylogenetic analysis of homologous RNA sequences, which is probably why they fall short in practical applications. We therefore present a new SCFG-based method that integrates phylogenetic analysis with a newly defined profile SCFG. The method can be summarized as follows: 1) we define a new profile SCFG, M, to describe the consensus secondary structure of a multiple RNA sequence alignment; 2) we introduce two distinct hidden Markov models, λ and λ', to perform phylogenetic analysis of homologous RNA sequences, where λ models the non-structural regions of the sequence and λ' models the structural regions; 3) we merge λ and λ' into M to obtain a combined model for predicting RNA secondary structure. We tested our method on data sets constructed from the Rfam database. In both sensitivity and specificity, its predictions are better than those of Pfold.
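The abstract does not define how sensitivity and specificity are computed. As a minimal sketch, assuming the base-pair-level definitions common in the RNA structure prediction literature (sensitivity as the fraction of reference base pairs that are predicted, and "specificity" as the fraction of predicted base pairs that appear in the reference), the two measures could be computed from dot-bracket strings as below. The function names and the dot-bracket input format are illustrative assumptions, not the authors' evaluation code.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)           # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), j))  # pair with the most recent '('
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def sensitivity_specificity(reference, predicted):
    """Base-pair-level sensitivity and specificity (assumed definitions).

    sensitivity = TP / (number of reference pairs)
    specificity = TP / (number of predicted pairs)
    """
    ref, pred = base_pairs(reference), base_pairs(predicted)
    tp = len(ref & pred)              # correctly predicted base pairs
    sens = tp / len(ref) if ref else 0.0
    spec = tp / len(pred) if pred else 0.0
    return sens, spec
```

For example, `sensitivity_specificity("((..))", "(....)")` returns `(0.5, 1.0)`: one of the two reference pairs is recovered, and the single predicted pair is correct.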