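A Perl script, pep_pattern.pl, was written to search a set of related sequences for highly similar sequence fragments. By matching all possible permutations of amino acid fragments, it counts the frequency and position of each matched motif in the sequences and finds short motifs (di- to tetrapeptides, 2-4 residues) in protein sequences.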
A motif is a sequence pattern that occurs repeatedly in a group of related DNA or protein sequences, and it is an important concept for describing the structure and function shared by the members of a protein family. However, motifs can be quite complex, and the underlying amino acid sequence pattern is often difficult to predict. Obtaining the desired short motifs (di- to tetrapeptides, 2-4 residues) with existing bioinformatics tools remains a difficult task. The Perl script pep_pattern.pl was written to address this problem and provides a convenient way to work with biological sequence motifs. It searches for highly similar amino acid sequence patterns, or motifs, in a group of related protein sequences by matching all possible permutations of amino acid fragments and counting the frequency and position of each matched motif in the sequences.
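The search described above can be illustrated with a minimal sketch. It is written in Python rather than Perl and is not the pep_pattern.pl source. The function name `find_motifs`, its parameters, and the rule that a fragment must occur in at least two sequences to count as a motif are all assumptions made for illustration. Instead of generating every possible fragment over the 20 standard amino acids and matching each one, the sketch slides a window of length 2 to 4 along each sequence, which should visit the same matches while skipping fragments that never occur.

```python
from collections import defaultdict

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def find_motifs(sequences, min_len=2, max_len=4, min_sequences=2):
    """Return {fragment: {sequence_name: [1-based start positions]}}.

    Hypothetical helper: keeps only fragments of length min_len..max_len
    that occur in at least `min_sequences` of the input sequences.
    """
    hits = defaultdict(lambda: defaultdict(list))
    for name, seq in sequences.items():
        seq = seq.upper()
        for k in range(min_len, max_len + 1):
            for start in range(len(seq) - k + 1):
                fragment = seq[start:start + k]
                # Skip windows containing gaps or ambiguous residues.
                if all(residue in AMINO_ACIDS for residue in fragment):
                    hits[fragment][name].append(start + 1)
    return {
        fragment: dict(by_seq)
        for fragment, by_seq in hits.items()
        if len(by_seq) >= min_sequences
    }


def motif_frequency(positions_by_sequence):
    """Total number of occurrences of one motif across all sequences."""
    return sum(len(positions) for positions in positions_by_sequence.values())


if __name__ == "__main__":
    # Toy input, invented for this example only.
    example = {"seq1": "MKTAYIAK", "seq2": "AKTAQ"}
    for motif, where in sorted(find_motifs(example).items()):
        print(motif, motif_frequency(where), where)
```

Positions are reported as 1-based start offsets, following the usual convention for residue numbering. Whether the original script reports them the same way is not stated in the abstract.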