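In 2003, the completion of the Human Genome Project announced the arrival of the post-genomic era. Researchers were surprised to find that the human genome contains only about 25,000 genes, far fewer than previously estimated. In 2005, on the 125th anniversary of its founding, the American journal Science ranked "Why do humans have so few genes?" third among the 125 most challenging frontier scientific questions of the 21st century and published a dedicated commentary on it. This paper interprets the question in light of recent progress in research on the regulation of protein-coding gene expression and on non-coding RNAs.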
Over the past fifty years, biologists have repeatedly estimated the protein-coding capacity of the human genome, and the estimated number of human genes has trended downward, from two million to the 25,000 recognized by the Human Genome Project. Recent studies lowered this number to 19,000, which suggests that humans have even fewer genes than the nematode worm Caenorhabditis elegans. The complexity and flexibility of higher mammalian genomes were evidently greatly underestimated, and they cannot be explained by protein-coding gene counts alone. Scientists now believe that the wide differences among higher organisms arise primarily from the regulation of gene expression at the molecular level, including transcriptional and post-transcriptional regulation. In the human genome, two major strategies underlie these processes. The first is alternative splicing: through alternative splicing of the exons and introns of pre-mRNAs transcribed from the human genome, one gene can produce multiple protein isoforms, which greatly increases the complexity of the proteome. In recent years this phenomenon has come to be regarded as one of the main reasons why the human genome shows such complexity with so few protein-coding genes. The second is non-coding RNA: an enormous number of active non-coding RNAs (ncRNAs) are transcribed from the non-protein-coding regions that account for approximately 98% of the human genome, and they form a highly intricate RNA regulatory network that adds further complexity to the human genome. With the implementation of the Encyclopedia of DNA Elements (ENCODE) project, biologists were surprised to find that ncRNA species are highly diverse, including snoRNAs, microRNAs, piRNAs, lncRNAs and circRNAs. These ncRNAs take part in maintaining the integrity of genetic information, regulating gene expression and forming functional complexes in cells. In addition, novel classes of ncRNAs and various cis-acting RNA elements are expected to be discovered and identified.
Taken together, these findings indicate that the human genome can be