A BAYESIAN APPROACH TO FALSE DISCOVERY RATE FOR LARGE SCALE SIMULTANEOUS INFERENCE
Open Access
- Author:
- Han, Bing
- Graduate Program:
- Statistics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 21, 2007
- Committee Members:
- Rana Arnold, Committee Chair/Co-Chair
Naomi S Altman, Committee Chair/Co-Chair
Claude Walker Depamphilis, Committee Member
Bing Li, Committee Member
- Keywords:
- Bayes false nondiscovery rate
Bayes false discovery rate
mixture model
MCMC
- Abstract:
- Microarray data and other applications have inspired many recent developments in large scale simultaneous inference. For microarray data, the number of simultaneous tests of differential gene expression in a typical experiment ranges from 1,000 to 100,000. The traditional family-wise type I error rate (FWER), defined as the probability of at least one type I error among all the tests, is overly stringent in this context because of the large scale of simultaneity. More recently, the false discovery rate (FDR) was defined as the expected proportion of type I errors among the rejections. Controlling the less stringent FDR criterion sacrifices less detection capability than controlling the FWER and hence is preferable for large scale multiple testing. From the Bayesian point of view, the posterior versions of the FDR and of the false nondiscovery rate (FNR) are easier to study. We study Bayesian decision rules to control the Bayes FDR and FNR. A hierarchical mixture model is developed to estimate the posterior probabilities of the hypotheses. The posterior distribution can also be used to estimate the false discovery percentage (FDP), defined as the integrand of the FDR. The model, in conjunction with Bayesian decision rules, displays satisfactory performance in simulations and in the analysis of the Affymetrix Latin Square HG-U133A spike-in data.
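
As an illustration of the kind of decision rule the abstract refers to, the following minimal Python sketch applies one common posterior-based rule. It is not taken from the dissertation: it assumes that a posterior probability of the null hypothesis is already available for each test (for example, from a fitted mixture model), and the function names `bayes_fdr_rejections` and `bayes_fnr_estimate` are hypothetical. The rule rejects the hypotheses with the smallest posterior null probabilities for as long as their average, which serves as an estimate of the Bayes FDR among the rejections, stays at or below a target level `alpha`.

```python
def bayes_fdr_rejections(post_null, alpha):
    """Return the indices of rejected hypotheses.

    post_null: posterior probabilities that each null hypothesis is true
               (assumed to be supplied by some fitted model).
    alpha:     target level for the estimated Bayes FDR.
    """
    # Consider hypotheses from the least to the most plausible null.
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    rejected = []
    running_sum = 0.0
    for i in order:
        # The average posterior null probability among the rejections
        # estimates the Bayes FDR; stop before it would exceed alpha.
        if (running_sum + post_null[i]) / (len(rejected) + 1) > alpha:
            break
        running_sum += post_null[i]
        rejected.append(i)
    return rejected


def bayes_fnr_estimate(post_null, rejected):
    """Average posterior probability of the alternative among non-rejections."""
    rejected_set = set(rejected)
    kept = [p for i, p in enumerate(post_null) if i not in rejected_set]
    if not kept:
        return 0.0
    return sum(1.0 - p for p in kept) / len(kept)


# Made-up posterior null probabilities, for illustration only.
post_null = [0.01, 0.9, 0.03, 0.2, 0.6]
rejected = bayes_fdr_rejections(post_null, alpha=0.1)
fnr = bayes_fnr_estimate(post_null, rejected)
```

Because the hypotheses are visited in increasing order of posterior null probability, the running average can only grow, so stopping at the first violation yields the largest rejection set of this form that meets the target.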