Comparative genomic hybridization (CGH) is used mainly to detect chromosomal deletions and amplifications in tumors. Laboratories and public databases have accumulated a large body of CGH data, which makes genome-wide analysis of the molecular mechanisms of tumorigenesis possible. In bioinformatics, tree models are generally used to study the history of biological formation and evolution, and evolutionary relationships between species are usually represented as phylogenetic trees. Tree models can likewise serve as powerful bioinformatics tools for analyzing CGH data and exploring carcinogenesis. This paper introduces two common tree models, the branching tree and the distance-based tree, describes in detail the basic principles and methods for reconstructing them, discusses several technical issues that arise in their construction, and reviews their applications in cancer research. As a generalization of the single-path linear model, the tree model overcomes the shortcomings of that model and can, in theory, more accurately capture the multigene, multipathway, multistep process of tumor development, allowing the molecular mechanisms of tumorigenesis to be explored from different angles. Beyond CGH data, tree models can be used to analyze many other types of data, including high-resolution data such as those produced by microarray-based CGH (array-CGH).