最近我们利用新一代测序(next generation sequencing,NGS)技术对肝细胞癌(hepatocellular carcinoma,HCC)患者活检标本及正常对照肝组织样品进行高通量RNA测序(RNA-sequencing,RNA-Seq),在肝癌样品中染色体11q13.1区域检测到几个相邻的RNA-Seq信号峰,而在正常对照组织中没有检测到,且该染色体区域目前尚无已知基因登录,提示这几个RNA-Seq峰可能代表一个或多个未知的新基因.以此为线索,证实这几个RNA-Seq峰来自同一个新基因,并克隆了该基因全长序列.在克隆该基因全长序列时,发现该基因编码的RNA存在多种剪接形式,最长的转录本为3562bp.将该基因编码的12条代表性RNA转录本序列递交到美国国立生物技术信息中心(National Center for Biotechnology Information,NCBI)的GenBank数据库中,GenBank ID号分别为KC136297~KC136308.该基因编码的RNA没有发现明显的开放阅读框(open reading frame,ORF),提示该基因可能编码长链非编码RNA(long non-coding RNA,lncRNA).为了探讨该lncRNA基因可能的转录调控机制,我们用生物信息学方法预测了该lncRNA基因潜在启动子区域,发现在其转录起始位点上游-719~-469bp处有一个潜在的启动子,其中包含7个Sp1、1个STAT5和1个EGR1转录因子结合位点.该lncRNA在肝细胞癌发生发展过程中的作用机制值得进一步深入研究.
Recently, we sequenced the transcriptomes of a hepatocellular carcinoma (HCC) biopsy and a normal liver tissue by RNA sequencing (RNA-Seq) based on next generation sequencing (NGS), and identified several adjacent RNA-Seq signal peaks on chromosome 11q13.1 in the HCC biopsy that were absent from the normal control tissue. No characterized gene has been annotated in this chromosomal region, implying that these RNA-Seq peaks may represent one or more novel genes. Further study confirmed that these RNA-Seq peaks were transcribed from a single novel gene. By cloning the full-length sequence of this gene, we found that it is transcribed into many splicing isoforms, the longest of which is 3562 bp. We deposited twelve representative RNA isoforms in the GenBank database of the National Center for Biotechnology Information (NCBI) under GenBank IDs KC136297 to KC136308. No significant open reading frame (ORF) was found in any transcript of this gene, implying that it may encode long non-coding RNAs (lncRNAs). To explore the potential transcriptional regulation mechanism of this lncRNA gene, we predicted its promoter from the upstream sequence using bioinformatic tools, and found one potential promoter located at -719 to -469 bp relative to the transcription start site, containing seven Sp1, one STAT5 and one EGR1 transcription factor binding sites. The molecular mechanisms by which this lncRNA gene acts in the carcinogenesis and progression of hepatocellular carcinoma merit further investigation.
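The absence of a significant ORF is the basis for calling the transcripts non-coding. As an illustration only, and not the authors' actual procedure (the abstract does not name the tool or the cutoff used), a six-frame ORF scan can be sketched as follows. The function names and the 300 nt (about 100 codons) threshold are assumptions introduced here for the example.

```python
# Illustrative six-frame ORF scan; not the method used in the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_orf(seq):
    """Return the length (nt, stop codon included) of the longest
    ATG...stop ORF found in any of the six reading frames."""
    seq = seq.upper().replace("U", "T")
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    best = 0
    for strand in (seq, reverse_complement):
        for frame in range(3):
            start = None  # index of the first ATG of the currently open ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best


def looks_noncoding(seq, min_orf_nt=300):
    """Hypothetical heuristic: treat a transcript as putatively non-coding
    when its longest ORF is shorter than min_orf_nt (assumed cutoff)."""
    return longest_orf(seq) < min_orf_nt
```

Applied to each deposited isoform sequence, `looks_noncoding` would flag transcripts whose longest ORF falls below the assumed cutoff; a real analysis would also weigh coding-potential scores and homology to known proteins.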