Stepwise forward multiple regression for complex traits in high density genome-wide association studies
Many genome-wide association studies (GWAS) have been initiated to detect genetic effects on complex diseases or traits. Methods have been developed to address multiple testing and dependence among test statistics in GWAS, but most methods in current use analyze each single-nucleotide polymorphism (SNP) separately. This research instead analyzes data from multiple SNPs jointly. In simulation studies based on a 115K SNP data set, methods based on separate SNP analyses either required criteria too stringent to detect weak genetic effects or yielded an excess of false positive results. To increase the power to detect multiple weak genetic factors and to reduce the false positives caused by multiple tests or by dependence among test statistics, a modified stepwise forward multiple regression (SFMR) approach is proposed. Rather than testing each SNP in a separate univariate analysis, this approach searches for multiple genetic factors together. The simulation studies showed three things. First, for detecting weak genetic effects, SFMR has at least 23% higher power than the Bonferroni and false discovery rate (FDR) procedures, and it retains an acceptable type I error rate regardless of whether the causal SNPs are correlated with many SNPs in the genome and regardless of how strong the causal effects are. Second, when the same significance criterion is used, SFMR has higher power and a lower familywise error rate than the Bonferroni and FDR procedures, especially when causal SNPs are correlated with many SNPs throughout the genome. Third, for detecting strong genetic effects, SFMR has a lower familywise error rate than the Bonferroni and FDR procedures when causal SNPs are correlated with many SNPs across the genome.
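As an illustration of the general idea behind forward stepwise selection, the following is a minimal sketch of the generic textbook procedure, not the modified SFMR method of the dissertation, whose details the abstract does not give. The function names `rss` and `forward_stepwise`, the partial F entry threshold `f_enter`, the cap `max_steps`, and the simulated genotype data are all assumptions made for this example.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def forward_stepwise(G, y, f_enter=30.0, max_steps=10):
    """Generic forward selection (hypothetical helper, not the thesis code).

    G: (n, m) genotype matrix coded 0/1/2; y: (n,) quantitative trait.
    At each step the SNP with the largest partial F statistic, given the
    SNPs already in the model, is added if that statistic is >= f_enter.
    """
    n, m = G.shape
    selected = []
    X = np.ones((n, 1))              # start from the intercept-only model
    rss_cur = rss(X, y)
    for _ in range(max_steps):
        best_j, best_f, best_rss = None, 0.0, None
        for j in range(m):
            if j in selected:
                continue
            X_new = np.column_stack([X, G[:, j]])
            rss_new = rss(X_new, y)
            df_resid = n - X_new.shape[1]
            # Partial F statistic (1 numerator df) for adding SNP j.
            f = (rss_cur - rss_new) / (rss_new / df_resid)
            if f > best_f:
                best_j, best_f, best_rss = j, f, rss_new
        if best_j is None or best_f < f_enter:
            break                    # no remaining SNP meets the entry criterion
        selected.append(best_j)
        X = np.column_stack([X, G[:, best_j]])
        rss_cur = best_rss
    return selected

# Toy data (assumed): 500 individuals, 200 independent SNPs with allele
# frequency 0.3, and causal effects planted at columns 3 and 17.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(500, 200)).astype(float)
y = 1.0 * G[:, 3] + 0.8 * G[:, 17] + rng.normal(size=500)
selected = forward_stepwise(G, y)
print(selected)
```

Because each candidate SNP is tested conditionally on the SNPs already selected, a marker that is merely correlated with an included causal SNP contributes little additional fit. This is one plausible reading of why a joint analysis can produce fewer false positives than separate per-SNP tests. The entry threshold here is arbitrary; a real analysis would need a criterion calibrated to the number of SNPs tested.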
Gu, Xiangjun, "Stepwise forward multiple regression for complex traits in high density genome-wide association studies" (2007). Texas Medical Center Dissertations (via ProQuest). AAI3283834.