
Faculty, Staff and Student Publications
Publication Date
10-25-2022
Journal
Nature Communications
Abstract
Genes with moderate to low expression heritability may explain a large proportion of complex trait etiology, but such genes cannot be sufficiently captured in conventional transcriptome-wide association studies (TWASs), partly due to the relatively small available reference datasets for developing expression genetic prediction models to capture the moderate to low genetically regulated components of gene expression. Here, we introduce a method, the Summary-level Unified Method for Modeling Integrated Transcriptome (SUMMIT), to improve the expression prediction model accuracy and the power of TWAS by using a large expression quantitative trait loci (eQTL) summary-level dataset. We apply SUMMIT to the eQTL summary-level data provided by the eQTLGen consortium. Through simulation studies and analyses of genome-wide association study summary statistics for 24 complex traits, we show that SUMMIT improves the accuracy of expression prediction in blood, successfully builds expression prediction models for genes with low expression heritability, and achieves higher statistical power than several benchmark methods. Finally, we conduct a case study of COVID-19 severity with SUMMIT and identify 11 likely causal genes associated with COVID-19 severity.
Keywords
Humans; Transcriptome; Genome-Wide Association Study; COVID-19; Quantitative Trait Loci; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease
DOI
10.1038/s41467-022-34016-y
PMID
36284135
PMCID
PMC9593997
PubMedCentral® Posted Date
10-25-2022
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Included in
Bioinformatics Commons, Biomedical Informatics Commons, Genetic Phenomena Commons, Medical Genetics Commons, Oncology Commons