Most existing disease gene prediction methods rely on annotation information about known disease-causing genes, yet many diseases have no known causative genes or related functional annotations. To address this problem, this paper proposes a disease gene prediction method based on text mining and functional similarity. The method obtains the Gene Ontology terms related to a disease through data mining, uses functional similarity to measure the degree of association between each gene and the disease, and ranks all candidate genes by that degree of association to identify disease-causing genes. Test results show that the method can effectively predict disease-causing genes that have no known functional annotation.
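The ranking step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the functional similarity measure, so Jaccard overlap between term sets stands in for it here as an assumption. The function names `jaccard` and `rank_candidates`, the gene names, and the term labels `T1` to `T5` are all hypothetical placeholders.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two term sets; returns 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    disease_terms: Set[str],
    gene_terms: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Score every candidate gene against the disease's term set and
    return (gene, score) pairs, highest score first.

    Jaccard overlap is an assumed stand-in for the functional
    similarity measure, which the abstract does not specify.
    """
    scored = [
        (gene, jaccard(disease_terms, terms))
        for gene, terms in gene_terms.items()
    ]
    # Sort by descending score; break ties by gene name for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical inputs: terms mined for a disease, and terms per candidate gene.
disease = {"T1", "T2", "T3"}
candidates = {
    "geneA": {"T1", "T2"},
    "geneB": {"T4"},
    "geneC": {"T1", "T2", "T3", "T5"},
}
ranking = rank_candidates(disease, candidates)
```

In this toy input, `geneC` shares the most terms with the disease and is ranked first, while `geneB` shares none and is ranked last. A measure that accounts for the ontology hierarchy would likely score related but non-identical terms more fairly than plain set overlap.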