Dissertations & Theses (Open Access)

Development of Graphical Models and Statistical Physics Motivated Approaches to Genomic Investigations

Author

Yashwanth Lagisetty

Author ORCID Identifier

0000-0001-7364-4630

Date of Graduation

8-2022

Document Type

Dissertation (PhD)

Program Affiliation

Biostatistics, Bioinformatics and Systems Biology

Degree Name

Doctor of Philosophy (PhD)

Advisor/Committee Chair

Edgar T. Walters - Advisory Professor

Committee Member

Olivier Lichtarge - Advisory Professor

Committee Member

Prahlad Ram

Committee Member

Anil Korkut

Committee Member

Marsal Sanches

Abstract

Identifying genes involved in disease pathology has been a goal of genomic research since the early days of the field. However, as technology improves and the body of research grows, we are faced with more questions than answers. Among these is the pressing matter of our incomplete understanding of the genetic underpinnings of complex diseases. Many hypotheses offer explanations as to why direct and independent analyses of variants, as done in genome-wide association studies (GWAS), may not fully elucidate disease genetics. These range from pointing out flaws in statistical testing to invoking the complex dynamics of epigenetic processes. In the studies outlined here, however, we focus on the hypothesis that interactions between genes may be a potential culprit. To probe this hypothesis, we begin by developing an algorithm, GeneEMBED, to model the total effect of protein-coding variants in various genes across a molecular network of genetic interactions. Given a population of affected and healthy individuals, GeneEMBED systematically evaluates the relative contribution of a gene to disease. The associations are quantified by examining the patterns of differential perturbations in the gene's interactions throughout a biological network. As a proof of concept, we applied GeneEMBED to two late-onset Alzheimer's disease (AD) cohorts of 5,169 exomes and 969 genomes. We identified 143 candidate disease-associated genes across the two cohorts and three biological networks. These candidate genes were differentially expressed in both bulk and single-cell RNA expression data from post-mortem AD brains. Knockouts of these candidates in mice are known to lead to abnormal neurological phenotypes. Lastly, in vivo Drosophila assays showed that the candidates modified neurodegenerative phenotypes.

Next, we focus on the discrepancies between the functional impact of mutations across different genes. While tools that predict the functional impact of a given coding mutation on the encoded protein are widely successful, they often make predictions relative to the given gene. To address this, we extend principles of statistical mechanics to biology to measure any given gene's relative mutational intolerance. Importantly, these mutational intolerance scores can distinguish essential genes from non-essential genes in E. coli. In humans, they can segregate genes that cause autosomal dominant Mendelian diseases from non-disease genes. Similarly, highly mutationally intolerant genes were enriched in core and conserved biological processes across three different species. Conversely, mutationally tolerant genes were involved in adaptive processes, again across three different species. Most notably, we found that mutational intolerance scores correlated strongly with experimentally measured fitness effects of gene knockdowns. Together, these efforts provide new tools with which to investigate disease-gene associations and provide insights into the biological dynamics of gene networks.

Recommended Citation

Lagisetty, Yashwanth, "Development of Graphical Models and Statistical Physics Motivated Approaches to Genomic Investigations" (2022). Dissertations & Theses (Open Access). 1205.
https://digitalcommons.library.tmc.edu/utgsbs_dissertations/1205

Keywords

Machine Learning, Statistical Mechanics, Thermodynamics, Genomics, Evolution, Alzheimer's Disease

Included in

Artificial Intelligence and Robotics Commons, Computational Biology Commons, Evolution Commons, Genomics Commons, Geometry and Topology Commons, Nervous System Diseases Commons, Statistical, Nonlinear, and Soft Matter Physics Commons
