Author ORCID Identifier
0000-0002-0189-219X
Date of Graduation
12-2018
Document Type
Thesis (MS)
Program Affiliation
Biomedical Sciences
Degree Name
Master of Science (MS)
Advisor/Committee Chair
Xiaoming Liu
Committee Member
Yunxin Fu
Committee Member
Peng Wei
Committee Member
Degui Zhi
Committee Member
Myriam Fornage
Abstract
IMPROVING dbNSFP
Mingyao Lu, B.S.
Advisory Professor: Xiaoming Liu, Ph.D.
The analysis and interpretation of DNA variation are central to whole exome sequencing (WES) studies. Genome research has largely focused on single nucleotide variants (SNVs). Indels are as important as SNVs, and indels in coding regions in particular are often candidate disease-causing variants, so it is necessary to expand the focus to include indel mutations.
The goal of my project is to provide an automatic annotation pipeline for WES-based disease studies by extending dbNSFP with a tool for automated indel annotation and deleteriousness prediction. Current sequencing results typically include both SNVs and indels. Although many tools are available that integrate functional predictions and annotations for SNV effects, to my knowledge there are no such tools for indels. Therefore, the aim of this thesis was to add deleteriousness prediction scores, including CADD, SIFT, and PROVEAN, to gene-model-based indel annotation. All of these scores can be calculated on the fly once the required resources are installed locally. A Docker image implementing the indel annotation and deleteriousness prediction has been developed and is ready to be deployed from the cloud.
Keywords
Indel Annotation, Functional Annotation, Whole Exome Sequencing, Deleteriousness Prediction Scores