Hybridization-based microarray technology yields noisy data because of non-specific (cross-) hybridization. This paper analyzes raw probe-level data from the widely used Affymetrix GeneChip and derives the relationship between the middle three nucleotides of a probe and the ratio of mismatch (MM) to perfect-match (PM) probe intensities. This relationship is embedded in the computation of gene expression levels to improve mmgMOS, an advanced probabilistic model for probe-level analysis. Tests on a standard spike-in dataset and a real mouse embryo dataset show that the improved model computes gene expression levels more accurately and with significantly higher computational efficiency, and that it reduces, to some degree, the influence of noise caused by cross-hybridization.
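As an illustration of the kind of relationship described above, the following minimal sketch tabulates the mean log2 MM/PM intensity ratio for each middle-nucleotide triplet. It is not the procedure used in this paper: the function names, the input layout (PM sequence, PM intensity, MM intensity) and the use of a per-triplet mean of log ratios are all assumptions made for the example, and the probe records are toy values rather than real GeneChip data.

```python
from collections import defaultdict
from math import log2


def middle_triplet(seq):
    """Return the three central nucleotides of an odd-length probe sequence.

    For a 25-mer this is bases 12-14 (1-indexed), centred on base 13.
    """
    mid = len(seq) // 2
    return seq[mid - 1: mid + 2]


def mean_log_ratio_by_triplet(probes):
    """Average log2(MM/PM) per middle triplet.

    probes: iterable of (pm_sequence, pm_intensity, mm_intensity) tuples
            (hypothetical input layout chosen for this sketch).
    Returns a dict mapping each triplet to its mean log2 ratio.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, pm, mm in probes:
        if pm <= 0 or mm <= 0:
            continue  # skip non-positive intensities, where the log is undefined
        key = middle_triplet(seq)
        sums[key] += log2(mm / pm)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy 25-mer probes (made-up sequences and intensities).
toy_seq = "A" * 11 + "GCT" + "A" * 11
toy_probes = [(toy_seq, 100.0, 50.0), (toy_seq, 200.0, 50.0)]
triplet_table = mean_log_ratio_by_triplet(toy_probes)
```

A table of this form, keyed by the 64 possible triplets, is one simple way such a sequence-dependent MM/PM relationship could be passed to a downstream expression model.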