Document Type

Thesis (MS)

Program Affiliation

Biomedical Sciences

Degree Name

Master of Science (MS)

Advisor/Committee Chair

Xiaoming Liu

Committee Member

Yunxin Fu

Committee Member

Peng Wei

Committee Member

Degui Zhi

Committee Member

Myriam Fornage



Mingyao Lu, B.S.

Advisory Professor: Xiaoming Liu, Ph.D.

The analysis and interpretation of DNA variation are central to whole exome sequencing (WES) studies. Genome research has focused largely on single nucleotide variants (SNVs). However, indels are as important as SNVs, and indels in coding regions in particular are often candidate disease-causing variants, so it is necessary to expand the focus to include indel mutations.

The goal of my project is to provide an automated annotation pipeline for WES-based disease studies by extending dbNSFP with a tool for indel annotation and deleteriousness prediction. Current sequencing results typically include both SNVs and indels. Although many tools are available that integrate functional predictions and annotations for SNV effects, to my knowledge there are no such tools for indels. Therefore, the aim of this thesis was to add deleteriousness prediction scores, including CADD, SIFT, and PROVEAN, to gene-model-based indel annotation. All of these scores can be calculated on the fly once the required resources are installed locally. A Docker image implementing the indel annotation and deleteriousness prediction has been developed and is ready to be deployed from the cloud.


Indels Annotation, Functional Annotation, Whole Exome Sequencing, Deleterious Prediction Scores


