Faculty, Staff and Student Publications
Publication Date
6-1-2024
Journal
The Annals of Applied Statistics
Abstract
Environmental exposures such as cigarette smoking influence health outcomes through intermediate molecular phenotypes, such as the methylome, transcriptome, and metabolome. Mediation analysis is a useful tool for investigating the role of potentially high-dimensional intermediate phenotypes in the relationship between environmental exposures and health outcomes. However, little work has been done on mediation analysis when the mediators are high-dimensional and the outcome is a survival endpoint, and none of it has provided a robust measure of total mediation effect. To this end, we propose an estimation procedure for Mediation Analysis of Survival outcome and High-dimensional omics mediators (MASH) based on sure independence screening for putative mediator variable selection and a second-moment-based measure of total mediation effect for survival data analogous to the R2" role="presentation" style="box-sizing: inherit; display: inline-block; line-height: 0; font-size: 18.08px; font-size-adjust: none; overflow-wrap: normal; text-wrap-mode: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; margin: 0px; padding: 1px 0px; color: rgb(33, 33, 33); font-family: BlinkMacSystemFont, -apple-system, "Segoe UI", Roboto, Oxygen, Ubuntu, Cantarell, "Fira Sans", "Droid Sans", "Helvetica Neue", sans-serif; position: relative;">R2R2 measure in a linear model. Extensive simulations showed good performance of MASH in estimating the total mediation effect and identifying true mediators. By applying MASH to the metabolomics data of 1919 subjects in the Framingham Heart Study, we identified five metabolites as mediators of the effect of cigarette smoking on coronary heart disease risk (total mediation effect, 51.1%) and two metabolites as mediators between smoking and risk of cancer (total mediation effect, 50.7%). Application of MASH to a diffuse large B-cell lymphoma genomics data set identified copy-number variations for eight genes as mediators between the baseline International Prognostic Index score and overall survival.
Keywords
high-dimensional mediators, mediation analysis, survival analysis, total mediation effect, variable selection
Included in
Bioinformatics Commons, Biomedical Informatics Commons, Medical Sciences Commons, Oncology Commons
Comments
Associated Data
PMID: 39328363