Traditional biological data analysis methods cannot efficiently process biological semantic datasets, which keep growing in scale. To address this, this paper applies a node similarity algorithm based on attribute co-occurrence to the ChEMBL dataset and constructs a bipartite graph model linking drug natural products to their activities. Using the GraphLab framework, it computes the similarity between natural products from their activity features and recommends activities for natural products with high similarity. Experimental results show that the method can effectively use the semantic information in biological datasets to discover potential activities of natural products, thereby guiding activity detection in the early stage of drug research as well as the discovery and selection of drug targets.
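The pipeline described above (bipartite graph, co-occurrence similarity, activity recommendation) can be illustrated with a minimal sketch. It is not the paper's implementation: it uses plain Python rather than GraphLab, it assumes Jaccard overlap of activity sets as the co-occurrence similarity measure, and the function names, the 0.5 threshold, and the toy product and activity labels are all hypothetical rather than taken from ChEMBL.

```python
from collections import defaultdict
from itertools import combinations


def build_bipartite(pairs):
    """Map each natural product to the set of activities it is linked to."""
    product_to_activities = defaultdict(set)
    for product, activity in pairs:
        product_to_activities[product].add(activity)
    return dict(product_to_activities)


def jaccard(a, b):
    """Share of co-occurring activities (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(graph):
    """Similarity for every product pair sharing at least one activity."""
    sims = {}
    for p, q in combinations(sorted(graph), 2):
        score = jaccard(graph[p], graph[q])
        if score > 0:
            sims[(p, q)] = score
    return sims


def recommend(graph, sims, product, threshold=0.5):
    """Suggest activities seen on similar products but not yet on `product`."""
    scores = defaultdict(float)
    for (p, q), score in sims.items():
        if score < threshold or product not in (p, q):
            continue
        neighbor = q if p == product else p
        for activity in graph[neighbor] - graph[product]:
            # Keep the best similarity that supports each candidate activity.
            scores[activity] = max(scores[activity], score)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data with hypothetical labels, for illustration only.
toy_pairs = [
    ("NP1", "antibacterial"), ("NP1", "antifungal"), ("NP1", "cytotoxic"),
    ("NP2", "antibacterial"), ("NP2", "antifungal"),
    ("NP3", "antiviral"),
]
toy_graph = build_bipartite(toy_pairs)
toy_sims = pairwise_similarity(toy_graph)
print(recommend(toy_graph, toy_sims, "NP2"))
```

In this toy example NP1 and NP2 share two of three distinct activities, so NP1's remaining activity becomes a candidate for NP2, while NP3 shares nothing and receives no recommendation. On a dataset of realistic size the all-pairs loop would be replaced by a graph-parallel computation, which is the role the abstract assigns to GraphLab.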