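With the continuing development of biotechnology and deepening research in phylogenetics, researchers reconstructing phylogenetic trees face a growing number of challenges and difficulties, for example: (1) the number of samples to analyze (species or individuals) keeps increasing; (2) the volume of data to analyze is expanding rapidly. Driven in particular by genome sequencing technology, phylogenetic reconstruction based on molecular information demands an enormous amount of computation, so mathematical methods, computer technology, and other auxiliary tools play a vital role in the efficiency and accuracy of phylogenetic reconstruction. Maximum parsimony is an important phylogenetic reconstruction method; improving its computational efficiency is of great significance for phylogenetic research, and optimizing the algorithm requires the joint effort of biologists and computer scientists. By describing the computational procedure of maximum parsimony in detail and analyzing how its parameter choices affect computational efficiency, this article aims to help more computer users, even those without a grounding in phylogenetics, devise better, faster, and more accurate solutions to practical phylogenetic algorithm problems. It also explains the tree-building rationale and computational logic of maximum parsimony for phylogenetic researchers, to encourage continued improvement and optimization of the method.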
With the continuous development of biotechnology and progress in phylogenetics, researchers face more and more challenges in reconstructing phylogenetic trees: (1) the number of species (or individuals) in the taxon under study keeps increasing; (2) the number of taxonomic characters (for example, molecular information) for each species (or individual) is also growing. Especially with the advance of genome-sequencing technology, phylogenetic reconstruction based on molecular information requires massive computation. Mathematical methods, computer technologies, and other auxiliary tools play key roles in enhancing the efficiency and accuracy of phylogenetic reconstruction. Maximum parsimony (MP) is an important method for phylogenetic reconstruction, and enhancing its computational efficiency requires the efforts of both biologists and computer scientists. In this article, we elaborate the calculation procedure of the MP method in detail and analyze the influence of parameter selection on computational efficiency, in order to help computer researchers without detailed knowledge of phylogenetics develop better, faster, and more precise solutions to phylogenetic reconstruction in practice. At the same time, we explain the basic principles and computational logic of the MP method for phylogenetic researchers, to promote the continued improvement and optimization of maximum parsimony in biology.