BioTechniques paper published
May 25, 2002
LL Hsiao, RV Jensen, T Yoshida, KE Clark, JE Blumenstock, SR Gullans. Correcting for signal errors in the analysis of microarray data. BioTechniques 32:2 (Feb. 2002): 330-337.
Abstract
A variety of technical errors can arise in data analysis when using cDNA or oligonucleotide microarrays. One of the most insidious problems is saturation of the hybridization signal for highly abundant transcripts. This problem arises from truncation of the laser fluorescence signal. When the hybridization signal on the microarray is very strong, this truncation can have serious consequences that may not be readily apparent to the user. As an illustration of this problem, two subclasses of normal human tissue samples (six liver and six lung) were analyzed with Affymetrix GeneChip® probe arrays to evaluate patterns of expression for ~7000 human genes. Five of these datasets were found to suffer from signal truncation, which caused several tissues to be incorrectly classified by hierarchical clustering. To rectify this problem so that the gene expression data could be properly compared and clustered, we developed a "filtering" procedure that identifies a subset of genes least affected by the signal saturation. The filtering procedure can be obtained at www.hugeindex.org.
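The abstract does not spell out how the filtering procedure works, so the following is only a minimal sketch of the general idea: drop genes whose signal approaches the scanner's ceiling in any sample, then compare and cluster the rest. The function name `filter_unsaturated_genes`, the `ceiling` parameter, and all intensity values are hypothetical and chosen for illustration; this is not the procedure distributed at www.hugeindex.org.

```python
from typing import Dict, List


def filter_unsaturated_genes(
    expression: Dict[str, List[float]], ceiling: float
) -> Dict[str, List[float]]:
    """Keep genes whose signal stays below `ceiling` in every sample.

    `expression` maps a gene name to its intensities across samples.
    `ceiling` is an assumed saturation level, not a value from the paper.
    """
    return {
        gene: values
        for gene, values in expression.items()
        if max(values) < ceiling
    }


# Made-up intensities for three genes across three samples.
expression = {
    "ALB": [48000.0, 47950.0, 1200.0],   # near the ceiling in two samples
    "GAPDH": [9000.0, 8700.0, 9100.0],
    "SFTPC": [150.0, 210.0, 30500.0],
}

kept = filter_unsaturated_genes(expression, ceiling=40000.0)
print(sorted(kept))  # ['GAPDH', 'SFTPC']
```

Only the retained genes would then be passed to hierarchical clustering, so that truncated values for highly abundant transcripts do not distort the distances between samples.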
