Incorporating variant prioritization in test of de novo association

Hao Hu, The University of Texas School of Public Health

Abstract

De novo mutations in the human genome have been linked to many diseases. Most existing de novo mutation tests count the mutations in a gene that meet a specific classification, without regard to potential differences in their impact on the structure and function of the protein. I introduce the VARiant PRIoritization SuM (VARPRISM) software package, which incorporates functional variant prioritization information to improve the power to detect de novo mutations influencing disease risk. VARPRISM evaluates the consequence of any given exonic mutation on the protein sequence to estimate the likelihood that the mutation is benign or damaging. Then, at the gene level, VARPRISM conducts a likelihood ratio test by calculating the joint likelihood of all observed mutations. VARPRISM considers all protein-coding single-nucleotide mutations, small insertions, and small deletions in a combined statistical test. In theoretical simulations, VARPRISM exhibited higher statistical power than fitDNM and two Poisson-based tests and was robust to mis-specified distributions of amino acid substitutions. I analyzed the Simons Simplex Collection of 2,508 parent-offspring autism trios using VARPRISM, replicating 44 previously identified autism susceptibility genes and identifying 20 additional candidate genes, including MYO1E, KCND3, PDCD1, DLX3, and TSPAN4 (FDR < 0.3). These candidate genes were significantly over-represented in 6 functional gene classes, including embryonically expressed genes, essential genes (p = 5.6 × 10^-4), chromatin modifiers (p = 7.2 × 10^-5), fragile X mental retardation (p = 8.5 × 10^-10), schizophrenia (p = 4.0 × 10^-3), and intellectual disability (p = 1.1 × 10^-9). In conclusion, by incorporating variant prioritization information, VARPRISM achieved improved statistical power for finding genes with increased de novo mutation load among affected individuals, in both simulated datasets and a real-world autism whole-exome sequencing dataset.
VARPRISM can analyze single-nucleotide mutations, small insertions, and small deletions simultaneously.
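
The abstract names two Poisson-based tests as comparators but does not specify them. For orientation, a common form of such a count-based test compares the number of de novo mutations observed in a gene against the number expected from a per-gene mutation rate. The following is a minimal sketch under that assumption; it is not the VARPRISM likelihood ratio test, and the function names, the rate value, and the expectation of 2 × trios × rate are illustrative choices rather than details taken from the dissertation.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    The upper tail is summed directly instead of computing 1 - CDF,
    so very small p-values are not lost to floating-point cancellation.
    """
    if observed <= 0:
        return 1.0
    if expected <= 0.0:
        return 0.0
    k = observed
    # Poisson pmf at k, evaluated in log space for numerical stability.
    term = math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
    total = 0.0
    while True:
        total += term
        k += 1
        term *= expected / k  # pmf(k) = pmf(k - 1) * expected / k
        # Past the mode the terms shrink; stop once they are negligible.
        if k > expected and term <= total * 1e-17:
            break
    return min(1.0, total)


def denovo_burden_pvalue(observed: int,
                         per_gene_mutation_rate: float,
                         n_trios: int) -> float:
    """One-sided p-value for an excess of de novo mutations in one gene.

    Assumption: the rate is per chromosome per generation, and each
    offspring contributes two transmitted chromosomes.
    """
    expected = 2.0 * n_trios * per_gene_mutation_rate
    return poisson_upper_tail(observed, expected)


# Hypothetical gene with a mutation rate of 1e-5, tested in 2,508 trios.
p_value = denovo_burden_pvalue(observed=3,
                               per_gene_mutation_rate=1e-5,
                               n_trios=2508)
```

A test of this kind treats every qualifying mutation in the gene identically, which is the limitation the abstract attributes to count-based approaches and which VARPRISM addresses by weighting mutations according to their predicted functional consequence.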

Subject Area

Biostatistics

Recommended Citation

Hu, Hao, "Incorporating variant prioritization in test of de novo association" (2016). Texas Medical Center Dissertations (via ProQuest). AAI10195888.
https://digitalcommons.library.tmc.edu/dissertations/AAI10195888
