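Effectively selecting disease candidate genes from the many genes in a linkage-mapped region is fundamental to the diagnosis, treatment, and prevention of disease. Using gene functional annotation, we designed and implemented DGP, a new disease gene prediction tool based on gene functional similarity. DGP analyzes the similarity between the GO (Gene Ontology) annotations of candidate genes and those of known disease genes, then scores and ranks the candidate genes. To test DGP's performance, we extracted from the OMIM database a dataset of 1,045 known disease genes involving 305 diseases: 56.7% of the disease genes ranked in the top 5% of the candidate genes, and 68.5% ranked in the top 10%. The results show that DGP is highly accurate and can effectively identify disease genes within a given chromosomal interval.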
Identifying disease genes is essential for elucidating pathogenesis and developing diagnostic and preventive measures. This paper presents a computational tool, named DGP, that assesses candidate genes in chromosomal regions of interest for their likelihood of being related to a given disease. DGP prioritizes the candidate genes by measuring their functional similarity to the known causative genes of the disease. We evaluated the performance of DGP on a dataset containing 1,045 genes related to 305 diseases. The validation results show that 56.7% and 68.5% of disease-associated genes rank in the top 5% and top 10%, respectively, of the list prioritized by DGP. DGP can therefore effectively support the selection of candidate genes in chromosomal regions of interest for mutation analysis.
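
The prioritization idea can be illustrated with a minimal sketch. It is not DGP's actual scoring scheme, which the abstract does not specify: as a stand-in, it assumes that functional similarity is the Jaccard overlap between sets of annotation terms, and that a candidate's score is its best similarity to any known disease gene. The function names (`jaccard`, `prioritize`, `rank_fraction`), the gene names, and the term labels are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two term sets (assumed stand-in for functional similarity)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def prioritize(
    candidates: Dict[str, Set[str]],
    known: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Score each candidate by its best similarity to any known disease gene,
    then sort by descending score (ties broken by gene name)."""
    scored = []
    for gene, terms in candidates.items():
        score = max((jaccard(terms, k) for k in known.values()), default=0.0)
        scored.append((gene, score))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


def rank_fraction(ranking: List[Tuple[str, float]], gene: str) -> float:
    """Position of `gene` in the ranking as a fraction of the list length
    (0.05 or less would mean the gene falls in the top 5%)."""
    names = [g for g, _ in ranking]
    return (names.index(gene) + 1) / len(names)


# Toy data with hypothetical gene names and annotation labels.
known_genes = {"K1": {"t1", "t2", "t3"}}
candidate_genes = {
    "C1": {"t1", "t2"},
    "C2": {"t4"},
    "C3": {"t2", "t3", "t5"},
}

ranking = prioritize(candidate_genes, known_genes)
print([gene for gene, _ in ranking])
```

In an evaluation of the kind described above, a known disease gene would be placed among the candidates of its chromosomal region, and `rank_fraction` would indicate whether it lands in the top 5% or top 10% of the prioritized list.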