Current approaches to identifying differentially expressed genes rely either on fold changes or on traditional hypotheses of equality. Fold changes, however, do not account for the variability in the estimated average expression. Moreover, because fold changes are not used within a hypothesis-testing framework, the error probabilities associated with declaring a gene differentially expressed cannot be quantified or evaluated. The traditional hypothesis of equality, on the other hand, ignores the magnitude of the biologically meaningful fold change that truly differentiates genes between populations. Because microarray experiments test a large number of genes with a small number of samples, the false positive rate for differentially expressed genes is quite high and requires either further multiplicity adjustments or an arbitrary cutoff for the p-values. None of these adjustments, however, has a biological justification. Hence, based on the interval hypothesis, Liu and Chow proposed a two one-sided tests procedure that considers the minimal biologically meaningful fold change and statistical significance simultaneously. To incorporate the correlation structure of expression levels among genes and to accommodate possible violations of the normality assumption, we propose applying a permutation method to the two one-sided tests procedure. A simulation study empirically compares the type I error rate and power of procedures based on the traditional hypothesis with those of the proposed permutation two one-sided tests procedure based on the interval hypothesis, under various combinations of fold changes, variability, and sample sizes.
Simulation results show that the proposed permutation two one-sided tests procedure based on the interval hypothesis not only controls the type I error rate at the nominal level but also provides adequate power to detect differentially expressed genes. Numerical data from the public domain illustrate the proposed methods.
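The idea of a permutation two one-sided tests procedure can be illustrated with a minimal sketch. It is not the authors' exact procedure, and it rests on several assumptions that the abstract does not state: expression values are on the log2 scale; the interval hypothesis takes the null to be that the mean difference lies within [-delta, +delta], where delta is the log2 of the minimal biologically meaningful fold change; the test statistic is a plain difference of group means; and sample labels are permuted jointly across all genes so that between-gene correlation is preserved. The function name `permutation_tost` and all parameter names are hypothetical.

```python
import numpy as np


def permutation_tost(x, y, delta, n_perm=2000, seed=0):
    """Illustrative permutation version of two one-sided tests.

    x, y  : arrays of shape (genes, n1) and (genes, n2), log2 expression
    delta : log2 of the minimal meaningful fold change (assumed symmetric)
    Returns per-gene p-values (p_upper, p_lower) for the one-sided
    alternatives mean(x) - mean(y) > delta and mean(x) - mean(y) < -delta.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1 = x.shape[1]

    obs = x.mean(axis=1) - y.mean(axis=1)
    obs_up = obs - delta  # distance beyond the upper boundary
    obs_lo = obs + delta  # distance beyond the lower boundary

    # Shift group 1 so that each boundary of the interval corresponds to
    # "no difference"; labels are then exchangeable at that boundary.
    pooled_up = np.concatenate([x - delta, y], axis=1)
    pooled_lo = np.concatenate([x + delta, y], axis=1)

    # Start counts at 1 so the observed labelling is included.
    count_up = np.ones(obs.shape[0])
    count_lo = np.ones(obs.shape[0])
    for _ in range(n_perm):
        # One permutation of sample labels, shared by every gene.
        idx = rng.permutation(pooled_up.shape[1])
        g1, g2 = idx[:n1], idx[n1:]
        perm_up = pooled_up[:, g1].mean(axis=1) - pooled_up[:, g2].mean(axis=1)
        perm_lo = pooled_lo[:, g1].mean(axis=1) - pooled_lo[:, g2].mean(axis=1)
        count_up += perm_up >= obs_up
        count_lo += perm_lo <= obs_lo

    return count_up / (n_perm + 1), count_lo / (n_perm + 1)


# Hypothetical usage: 2 genes, 8 samples per group, threshold of a 2-fold change.
demo_rng = np.random.default_rng(1)
y_demo = demo_rng.normal(0.0, 0.3, size=(2, 8))
x_demo = demo_rng.normal(0.0, 0.3, size=(2, 8))
x_demo[0] += 3.0  # gene 0 is shifted well beyond the threshold
p_upper, p_lower = permutation_tost(x_demo, y_demo, delta=1.0, n_perm=500)
```

Under these assumptions, a gene would be flagged as differentially expressed when either one-sided p-value falls below the chosen significance level; how that level is split between the two tests is a design choice left open here.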
Date:
2008-09
Relation:
Journal of Biopharmaceutical Statistics. 2008 Sep;18(5):808-826.