Date of Graduation

8-2023

Document Type

Dissertation (PhD)

Program Affiliation

Quantitative Sciences

Degree Name

Doctor of Philosophy (PhD)

Advisor/Committee Chair

Christine B. Peterson

Committee Member

Peng Wei

Committee Member

James P. Long

Committee Member

Ziyi Li

Committee Member

Scott Kopetz

Abstract

This study explores how integrating multi-platform genomic datasets can improve our understanding of the biological mechanisms behind cancer. Merging clinical outcomes with multi-platform genomic data offers insight into the mechanisms underlying a patient’s response to treatment, and evaluating the correlations between genetic variations and gene expression clarifies the functional significance of those variations. Such knowledge has the potential to revolutionize cancer diagnosis and treatment. This thesis describes methods developed to address two related aims.

Aim 1: We developed a unified mediation analysis approach to identify and quantify the intermediate mechanisms underlying gene or pathway effects on clinical outcomes. The framework is based on the causal directed acyclic graph (DAG) structure and is designed for multi-omics and clinical datasets in cancer, which contain multiple potential causes, multiple mediators, and, in addition to continuous outcomes, categorical and survival responses that are not suitable for linear models. Using kidney renal clear cell carcinoma proteogenomic data, we identified genes whose effects are mediated by proteins, along with the underlying mechanisms, for various survival outcomes that capture short- and long-term disease-specific clinical characteristics.

Aim 2: We developed an allele-specific expression quantitative trait loci (eQTL) mapping framework that accounts for impure tumor samples and copy number alterations. eQTL analysis is a powerful tool for understanding the genetic basis of complex traits. However, existing methods do not account for copy number alterations in tumor cells, resulting in increased false positive discoveries. We demonstrated that the proposed framework controls the type I error and increases statistical power. When applied to The Cancer Genome Atlas (TCGA) colon cancer tumor samples, our framework identified more eGenes (genes with at least one significant eQTL) than the method that does not account for copy number alterations.
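For readers unfamiliar with mediation analysis, the following is a minimal sketch of the classical single-exposure, single-mediator, continuous-outcome case, in which the total effect splits into a direct effect and an indirect (mediated) effect. It is not the unified framework developed in the dissertation, which handles multiple causes, multiple mediators, and categorical and survival outcomes. The function names, the simulated data, and the effect sizes (0.5, 0.3, 0.8) are all assumptions made for illustration.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def simple_mediation(x, m, y):
    """Product-of-coefficients decomposition with ordinary least squares.

    Illustrative only: one exposure x, one mediator m, continuous outcome y.
    """
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx                            # slope of m ~ x
    det = sxx * smm - sxm ** 2
    b = (smy * sxx - sxy * sxm) / det        # coefficient of m in y ~ x + m
    direct = (sxy * smm - smy * sxm) / det   # coefficient of x in y ~ x + m
    total = sxy / sxx                        # slope of y ~ x
    return {"indirect": a * b, "direct": direct, "total": total}


# Simulated (hypothetical) data: x affects y directly and through m.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

effects = simple_mediation(x, m, y)
print(effects)
```

In this linear setting the estimated total effect equals the sum of the direct and indirect estimates, which is the decomposition that more general mediation frameworks extend to non-linear and survival outcomes.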

Keywords

Mediation Analysis; eQTL

Included in

Biostatistics Commons
