Date of Graduation
8-2023
Document Type
Dissertation (PhD)
Program Affiliation
Quantitative Sciences
Degree Name
Doctor of Philosophy (PhD)
Advisor/Committee Chair
Christine B. Peterson
Committee Member
Peng Wei
Committee Member
James P. Long
Committee Member
Ziyi Li
Committee Member
Scott Kopetz
Abstract
This study explores how integrating multi-platform genomic datasets can improve our understanding of the biological mechanisms behind cancer. Merging clinical outcomes with data from multi-platform genomic studies gives insight into the biological mechanisms behind a patient’s response to treatment, and evaluating the correlations between genetic variations and gene expression clarifies the functional significance of those variations. Such knowledge has the potential to revolutionize cancer diagnosis and treatment. This thesis describes methods developed to address two related aims.
Aim 1: We developed a unified mediation analysis approach to identify and quantify the intermediate mechanisms underlying gene or pathway effects on clinical outcomes. The framework is based on the causal directed acyclic graph (DAG) structure and is designed for multi-omics and clinical datasets in cancer, which contain multiple potential causes, multiple mediators, and, in addition to continuous outcome variables, categorical and survival responses that are not suitable for linear models. Using kidney renal clear cell carcinoma proteogenomic data, we identified genes whose effects are mediated by proteins, along with the underlying mechanisms, for several survival outcomes that capture short- and long-term disease-specific clinical characteristics.
Aim 2: We developed an allele-specific expression quantitative trait loci (eQTL) mapping framework that accounts for impure tumor samples and copy number alterations. eQTL analysis is a powerful tool for understanding the genetic basis of complex traits. However, existing methods do not account for copy number alterations in tumor cells, which increases false positive discoveries. We demonstrated that our proposed framework controls the type I error and increases statistical power. When applied to The Cancer Genome Atlas (TCGA) colon cancer tumor samples, our framework identified more eGenes (genes with at least one significant eQTL) than the method that does not account for copy number alterations.
Keywords
Mediation Analysis; eQTL