Integrative Bayesian Variable Selection (iBVS) for Biomarker Discovery
A primary objective of personalized medicine (Jain, 2009) is the development of molecular biomarkers for effective disease diagnosis, treatment, and prevention. High-throughput technologies (e.g., microarrays and next-generation sequencing) enable new findings but also complicate discovery. A primary challenge of biomarker discovery lies in deriving predictive and reliable biomarkers from large numbers of candidates (e.g., thousands of SNPs and hundreds of genes), and these discovery studies are usually hampered by relatively small sample sizes. With funding from the National Institute of Child Health and Human Development, we have developed an integrative Bayesian variable selection (iBVS) strategy for microarray gene expression data with a binary clinical outcome variable. This strategy follows the philosophy of systems biology: known gene regulatory networks are incorporated directly into the prior distribution for the model parameters, so that the network guides gene selection. Currently, we are developing an R package for iBVS that extends the approach to other forms of omics data (e.g., SNPs, RNA-Seq, and DNA methylation) and to binary (e.g., case vs. control), categorical (e.g., subtypes of leukemia), continuous (e.g., BMI score), count (e.g., certain blood cell counts), and time-to-event (e.g., 5-year survival) clinical outcomes or phenotypic traits.
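To illustrate how a gene network can enter a variable selection prior, here is a minimal sketch in Python. It assumes an Ising-type Markov random field prior on the gene inclusion indicators, which is one common choice in the literature; the description above does not state which prior form iBVS uses. The function names, the hyperparameters `a` (sparsity) and `b` (network smoothness), and the toy network are all hypothetical.

```python
import math

def mrf_log_prior(gamma, edges, a=-2.0, b=0.5):
    """Unnormalized log prior of an inclusion vector under an Ising-type MRF.

    gamma : list of 0/1 indicators (1 = gene is selected)
    edges : list of (i, j) pairs from a known gene network (assumed, no self-loops)
    a     : hypothetical sparsity parameter (more negative = fewer genes)
    b     : hypothetical smoothness parameter (larger = connected genes
            are more likely to be selected together)
    """
    sparsity_term = a * sum(gamma)
    network_term = b * sum(gamma[i] * gamma[j] for i, j in edges)
    return sparsity_term + network_term

def conditional_inclusion_prob(j, gamma, edges, a=-2.0, b=0.5):
    """Prior probability that gene j is selected, given all other indicators."""
    neighbors = [v if u == j else u for u, v in edges if j in (u, v)]
    logit = a + b * sum(gamma[k] for k in neighbors)
    return 1.0 / (1.0 + math.exp(-logit))

# Toy network: gene 1 is linked to genes 0 and 2; gene 3 is isolated.
edges = [(0, 1), (1, 2)]
gamma = [1, 0, 1, 0]

p_connected = conditional_inclusion_prob(1, gamma, edges)
p_isolated = conditional_inclusion_prob(3, gamma, edges)
print(p_connected, p_isolated)
```

Under this assumed prior, a gene whose network neighbors are already selected receives a higher prior inclusion probability than an isolated gene, which is the sense in which the network "guides" selection. In a full analysis this prior would be combined with a likelihood for the clinical outcome inside a posterior sampler, which is omitted here.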