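Identifying transcription factor binding sites plays an important role in the regulation of gene expression. This paper proposes GOBMD (Genetic Optimization with Bayesian Model for Motif Discovery), a genetic optimization algorithm for motif discovery driven by a Bayesian model. GOBMD first uses a projection process based on position-weighted hashing to project the l-mers in the input sequences onto k-dimensional (k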
Transcription factor binding site (TFBS) detection plays an important role in gene finding and in understanding gene regulatory relationships. Because motifs are only weakly conserved, motif discovery is a challenging problem. We propose a new approach called Genetic Optimization with Bayesian Model for Motif Discovery (GOBMD). GOBMD first uses a projection based on position-weighted hashing, which maps the l-mers in the DNA sequences into k-dimensional subspaces (k < l), to find good starting candidate motifs. GOBMD then applies a genetic refinement step that evolves the candidate motifs for further optimization. Its fitness function incorporates the Bayesian formula and relative entropy to find the best configuration of site locations. Experimental results on simulated data show that GOBMD is competitive with Gibbs, WINNOWER, SP-STAR, and PROJECTION on most implanted (l,d)-motif finding problems. We compare the performance coefficient scores on these (l,d)-motif problems using separate box plots for each of the algorithms listed above. Experiments on real biological data, in which GOBMD identifies a number of known transcriptional regulatory motifs in eukaryotes, also show that it can predict TFBSs efficiently.
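To make the building blocks named in the abstract concrete, the following Python sketch illustrates three of them under stated assumptions. It is not the GOBMD implementation. The function names (`project_lmers`, `relative_entropy`, `performance_coefficient`) are hypothetical. The projection shown is the plain bucket hashing used by PROJECTION-style methods, since the abstract does not specify the position-weighted hash. The relative entropy assumes a uniform background and a pseudocount of 0.5. The performance coefficient follows the usual definition |K ∩ P| / |K ∪ P| over nucleotide positions of known (K) and predicted (P) sites.

```python
import math
from collections import Counter, defaultdict


def project_lmers(sequences, l, positions):
    """Group every l-mer by the letters found at the k chosen positions.

    Assumption: plain projection hashing, not GOBMD's position-weighted hash.
    Returns a dict mapping each k-letter key to a list of (sequence, start).
    """
    buckets = defaultdict(list)
    for seq_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_idx, start))
    return buckets


def relative_entropy(sites, background=None, pseudocount=0.5):
    """Sum over columns of sum_b p(b) * log2(p(b) / q(b)).

    Assumptions: DNA alphabet, uniform background q unless one is given,
    and a pseudocount so that no estimated frequency is zero.
    """
    alphabet = "ACGT"
    if background is None:
        background = {b: 0.25 for b in alphabet}
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        counts = Counter(column)
        for b in alphabet:
            p = (counts[b] + pseudocount) / (n + pseudocount * len(alphabet))
            total += p * math.log2(p / background[b])
    return total


def performance_coefficient(known, predicted, l):
    """|K ∩ P| / |K ∪ P| over the nucleotide positions covered by the sites.

    `known` and `predicted` are lists of (sequence index, start) pairs.
    """
    def covered(sites):
        return {(s, start + i) for s, start in sites for i in range(l)}

    k, p = covered(known), covered(predicted)
    union = k | p
    return len(k & p) / len(union) if union else 0.0
```

In a pipeline of this kind, buckets that collect many l-mers would serve as starting candidates, and a score such as the relative entropy would rank the alignments produced during refinement. A fully conserved alignment scores higher than one whose columns match the background distribution.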