Construction of a Standard Human Transcript Dataset Based on the RefSeq Database

ISSN: 0253-9772
Journal: 《遗传》 (Hereditas (Beijing))
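Classification: Q754 [Biology: Molecular Biology]
Author affiliations: [1] Institute of Radiation Medicine, Academy of Military Medical Sciences, Beijing 100850; [2] College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073; [3] College of Computer Science, Beijing University of Technology, Beijing 100822; [4] Institute of Health Service and Medical Information, Academy of Military Medical Sciences, Beijing 100850; [5] National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073
Funding: jointly supported by the National Basic Research Program of China (973 Program) (No. 2003CB715900), the National High Technology Research and Development Program of China (863 Program) (No. 2002AA234021), the Fund of the National Key Laboratory for Parallel and Distributed Processing (No. 51484050304JB4401), and the China Education Grid (ChinaGrid) Bioinformatics Grid Project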

Authors: Li Zhifeng [1,2], Li Yujian [3], Zhao Dongsheng [4], Hang Xingyi [1], Wang Zhengzhi [2], Luo Zhigang [5], Zhang Chenggang [1]

Keywords: RefSeq database, transcriptome, quality control, database of standard transcript sequences of human

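Abstract (translated from the Chinese):

The U.S. National Center for Biotechnology Information (NCBI) provides the RefSeq reference sequence database of biologically non-redundant gene and protein sequences. However, because gene polymorphism is widespread and laboratories differ in their quality control for sequencing, among other reasons, the RefSeq database has been found to possibly contain some quality problems. Based on the "central dogma", this article proposes the concept of a "standard transcript dataset". Taking human gene and genome sequences as an example, it uses gene structure analysis programs, namely BLAT, Sim4, and the authors' own Elparser, to analyze the correspondence between the RefSeq human gene transcript data (2005-4-18) and the currently published human standard genome (2005-4-20). For records labeled NM_ and NR_, which are supported by experimental evidence, analysis with multiple programs shows that 9771 records correspond exactly to the standard genome; 10943 records meet the revision criteria of multiple programs; 203 records differ substantially from the standard genome; and 2676 records give inconsistent results across programs. This indicates that researchers using such non-standard transcriptome data must consider why the transcripts are non-standard, and even the possibility that they contain errors. This work provides an important reference standard for bioinformatics data analysis, molecular biology experimental design, and analysis of gene diversity and genetic variation based on a standard, high-quality transcript dataset. The results are available at http://biocompute.bmi.ac.cn/transcriptome/index.htm.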

English abstract:

The NCBI Reference Sequence (RefSeq) database aims to provide a biologically non-redundant collection of DNA, RNA, and protein sequences and to promote research on the genes and proteins of humans and other species. However, because polymorphisms are widespread and the quality control of experiments differs between laboratories, the RefSeq database contains potential problems that need to be identified. We therefore define the concept of a standard transcript, based on the central dogma of biology: each standard transcript should map perfectly to the standard genomic DNA sequence at the exon level. We then performed a large-scale analysis mapping all human RefSeq records (2005-4-18) to the officially released human genome sequence database (2005-4-20) using BLAT, Sim4, and a homemade program, Elparser, which was designed specifically for this purpose. The standard transcripts based on the RefSeq database were obtained from the alignment with the standard human genome database. Of the human RefSeq records labeled "NM_" and "NR_", 9771 could be mapped perfectly to the human genome sequence, while another 10943 could be considered standard transcripts after reasonable revision by comparison with the genome sequence according to all three methods. The remaining 203 unrevisable records and 2676 records on which the programs gave inconsistent results could not be considered standard transcripts and should be checked critically before use because of potential errors in them. Our study thus provides a high-quality human reference standard dataset for further bioinformatic and experimental analyses, such as studies of polymorphism and mutation in human genes. The reference standard dataset based on the above criteria can be retrieved from http://biocompute.bmi.ac.cn/transcriptome/index.htm.
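
The four-way split reported in the abstract can be illustrated with a small sketch. The abstract does not give the actual decision rule or the output formats of BLAT, Sim4, and Elparser, so everything below is an assumption for illustration: the label names, the rule "all programs agree, otherwise inconsistent", and the toy accession IDs are hypothetical. Only the four counts are taken from the abstract.

```python
from collections import Counter

# Hypothetical category labels (not the authors' terminology or file format).
PERFECT = "perfect"            # maps exactly to the genome
REVISABLE = "revisable"        # standard after reasonable revision
DIVERGENT = "divergent"        # differs substantially, not revisable
INCONSISTENT = "inconsistent"  # programs disagree

def classify(verdicts):
    """Combine per-program verdicts for one record.

    Assumed rule: if every program reports the same label, use it;
    otherwise (including an empty input) mark the record inconsistent.
    """
    labels = set(verdicts.values())
    if len(labels) == 1:
        return labels.pop()
    return INCONSISTENT

def tally(records):
    """Count records per consensus category."""
    return Counter(classify(v) for v in records.values())

# Toy input with made-up accession IDs and made-up verdicts.
records = {
    "NM_TOY_1": {"BLAT": PERFECT, "Sim4": PERFECT, "Elparser": PERFECT},
    "NM_TOY_2": {"BLAT": REVISABLE, "Sim4": REVISABLE, "Elparser": REVISABLE},
    "NR_TOY_3": {"BLAT": PERFECT, "Sim4": DIVERGENT, "Elparser": REVISABLE},
}
toy_counts = tally(records)

# Counts as reported in the abstract; shares are relative to their sum.
reported = {PERFECT: 9771, REVISABLE: 10943, DIVERGENT: 203, INCONSISTENT: 2676}
total = sum(reported.values())
shares = {k: round(100 * n / total, 1) for k, n in reported.items()}

if __name__ == "__main__":
    print(dict(toy_counts))
    print(total, shares)
```

Under this reading, the two categories treated as standard transcripts (perfect and revisable) together make up the large majority of the summed counts, while the divergent and inconsistent records are the ones the abstract recommends checking before use.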

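Project associated with this paper

Bioinformatics Theory and Applications of Gene Function Prediction

Journal papers: 22

Journal papers from the same project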

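Mutual regulation among microRNAs, transcription factors, and splicing factors in animals

Comparative analysis of the compression efficiency of different compression programs on massive bioinformatics data

Construction and application of the PCR primer specificity checking system (PSC)

Current status and prospects of research on alternative splicing of eukaryotic genes

Bow-tie topological features of metabolic networks and the functional significance

Prediction of eukaryotic protein subcellular localization based on sequence conservation and protein-protein interactions

Intracellular hypoxia sensors: research progress on hypoxia-inducible factor-1 prolyl hydroxylases

Finding transcriptional regulation patterns of genes specifically expressed in liver cancer using bioinformatics methods

Bioinformatics research on protein subcellular localization

The "dual coding" phenomenon in eukaryotes

Bow-tie structural features of metabolic networks and their functional significance

Studying the mutational activity of the β2-adrenergic receptor using evolutionary trace and molecular dynamics simulation

Caenorhabditis elegans: a model organism for hypoxia response research

Effect of the mouse FAAP protein on cell adhesion

GSDS: a gene structure display system

Nonlinear phenomena in biological systems

Modular analysis of metabolic networks using graph clustering algorithms

Analysis of sequence features flanking the start and stop codons of eukaryotic genes

Technical implementation of "StdTransDb", a Web service system for standard transcript databases of human and model organisms

Efficient and reproducible folding simulations of the Trp-cage protein with multiscale molecular dynamics

A Java-based Gene Ontology toolkit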

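Journal information

《遗传》 (Hereditas (Beijing))
Chinese Science and Technology Core Journal

Supervising body: Chinese Academy of Sciences
Sponsor: Genetics Society of China
Editor-in-chief: Zhang Yongqing
Address: Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, No. 1 Beichen West Road, Chaoyang District, Beijing
Postal code: 100101
Email: yczz@genetics.ac.cn
Telephone: 010-64807669

International standard serial number: ISSN 0253-9772
Domestic unified serial number: CN 11-1913/R
Postal distribution code: 2-810

Awards:
Chinese Natural Science Core Journal; 《CAJ-CD》 Implementation Excellence Award; in December 2008 received the "China Quality Science and Technology Journal" certificate and 北京市印...

Indexed in domestic and international databases:
Chemical Abstracts, online edition (USA); abstracts of the Centre for Agriculture and Bioscience International (UK); abstract and citation database (Netherlands); biomedical retrieval system (USA); biosciences database (USA); Japan Science and Technology Agency database (Japan); Chinese Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)

Citation count: 23270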