Data Normalization and Transformation
Normalization, and standardization as a special form of it, can help us separate true variation from differences due to experimental variability. This step may be necessary because creating, hybridizing, scanning, and quantifying microarrays is a complex process, and variation originating from the experimental process itself can contaminate the data. The various normalization methods aim to remove, or at least minimize, expression differences caused by any such contamination.
Scatter plot of two experiments where the signal values are significantly stronger in one experiment than in the other.
The same data plotted after normalization. Normalization was performed by subtracting the mean value and dividing by the standard deviation in each experiment. Notice that most of the gene expression values now lie much closer to the identity line, as would be expected for two experiments on the same set of genes.
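The standardization described in this caption (subtract the mean, divide by the standard deviation) can be sketched in a few lines. This is only an illustration: the data are synthetic, and the names `standardize`, `exp_a`, and `exp_b` are assumptions introduced here, not part of the original analysis.

```python
import numpy as np

def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

# Synthetic (made-up) signals for the same 600 genes in two experiments.
rng = np.random.default_rng(0)
base = rng.normal(loc=8.0, scale=2.0, size=600)
exp_a = base + rng.normal(scale=0.3, size=600)
# Second experiment: same genes, but a systematically stronger signal.
exp_b = 3.0 * base + 5.0 + rng.normal(scale=0.3, size=600)

z_a = standardize(exp_a)
z_b = standardize(exp_b)

# Average distance of the points from the identity line, before and after.
dist_before = np.mean(np.abs(exp_a - exp_b))
dist_after = np.mean(np.abs(z_a - z_b))
print(dist_before, dist_after)
```

After standardization each experiment has mean 0 and standard deviation 1, so a scatter plot of `z_a` against `z_b` should cluster around the identity line, provided the two experiments differ mainly by a shift and a scale factor.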
Histograms of a gene expression data set (600 genes x 21 experiments). The x-axis indicates the expression values and the y-axis shows the number of genes with a particular gene expression level. The left and right panels show the data before and after the log-transform, respectively. Notice how much closer the right panel is to a normal distribution than the left.
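The effect of the log-transform on the shape of the distribution can be checked numerically as well as visually. The sketch below is an illustration under stated assumptions: the 600 x 21 matrix is synthetic log-normal data rather than the data set in the figure, base 2 is an assumed choice of logarithm (the text does not say which base was used), and `skewness` is a helper defined here.

```python
import numpy as np

def skewness(x):
    """Sample skewness: roughly 0 for a symmetric (e.g. normal) distribution."""
    x = np.asarray(x, dtype=float).ravel()
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5

# Synthetic stand-in for a 600 genes x 21 experiments expression matrix.
# Values are strictly positive, which the logarithm requires.
rng = np.random.default_rng(1)
raw = rng.lognormal(mean=6.0, sigma=1.0, size=(600, 21))

# Log-transform (base 2 is an assumption).
logged = np.log2(raw)

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(skew_raw, skew_log)
```

A strongly positive skewness for `raw` and a value near zero for `logged` corresponds to the change seen between the left and right histograms. Real data may contain zeros or negative values after background correction, which would need to be handled before taking the logarithm.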