Error Analysis and Modeling of DNA Microarrays

To determine whether measured mRNA levels or other global data are consistent with a particular model of a cellular process, we need some notion of how much measurement error and biological variation can influence the measurements. Accordingly, we have developed refined statistical procedures, based on maximum-likelihood estimation, to determine which changes observed with a DNA microarray are significant and which are likely due to error. These methods are available for download (Windows, Linux, and SunOS platforms) as the public-domain software packages VERA and SAM.

VERA uses the method of maximum likelihood to estimate the parameters of a statistical model that describes the multiplicative and additive errors influencing an array experiment. SAM then assigns each gene on the array a value, lambda, which describes how likely it is that the gene is expressed differently between the two cell populations being compared. A large value of lambda means that the gene is almost certainly differentially expressed, while a small value (close to 0) indicates that there is no evidence for differential expression.
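To make the idea of a per-gene lambda concrete, the following is a minimal sketch of a generalized likelihood ratio statistic under a simple error model with both multiplicative and additive noise. It is an illustration only and is not the VERA or SAM implementation. It assumes that a measured intensity with true level mu has Gaussian error with variance (sigma_eps * mu)^2 + sigma_delta^2, that the two error parameters have already been estimated, that errors in the two cell populations are independent, and that intensities are positive. The function names and the grid search over mu are hypothetical choices made for brevity.

```python
import math


def log_likelihood(values, mu, sigma_eps, sigma_delta):
    """Gaussian log-likelihood of replicate intensities around a true level mu.

    Assumed variance: multiplicative part (sigma_eps * mu)^2 plus additive
    part sigma_delta^2. This is an illustrative model, not VERA's exact one.
    """
    var = (sigma_eps * mu) ** 2 + sigma_delta ** 2
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (v - mu) ** 2 / (2.0 * var)
        for v in values
    )


def lambda_statistic(x, y, sigma_eps, sigma_delta, steps=2000):
    """Likelihood ratio statistic for one gene measured in two populations.

    x, y: lists of positive replicate intensities for the two populations.
    Null hypothesis: both populations share one true level mu.
    Alternative: each population has its own true level.
    """
    # Candidate true levels; the same grid is used for both hypotheses,
    # so the alternative can never fit worse than the null.
    lo = 0.5 * min(x + y)
    hi = 1.5 * max(x + y)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]

    ll_x = [log_likelihood(x, mu, sigma_eps, sigma_delta) for mu in grid]
    ll_y = [log_likelihood(y, mu, sigma_eps, sigma_delta) for mu in grid]

    best_alt = max(ll_x) + max(ll_y)                    # separate levels
    best_null = max(a + b for a, b in zip(ll_x, ll_y))  # one shared level
    return 2.0 * (best_alt - best_null)


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    unchanged = lambda_statistic([1000.0, 1100.0, 950.0],
                                 [1000.0, 1100.0, 950.0], 0.1, 50.0)
    changed = lambda_statistic([1000.0, 1100.0, 950.0],
                               [3000.0, 3200.0, 2900.0], 0.1, 50.0)
    print("lambda (identical replicates):", unchanged)
    print("lambda (threefold change):", changed)
```

In this sketch, a gene whose replicates look alike in both populations gets a lambda near 0, and a gene with a clear shift gets a much larger value, matching the interpretation given above. The grid search is used only to keep the example short; a numerical optimizer would be the more natural choice in practice.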