Author ORCID Identifier

0000-0002-0189-219X

Date of Graduation

12-2018

Document Type

Thesis (MS)

Program Affiliation

Biomedical Sciences

Degree Name

Master of Science (MS)

Advisor/Committee Chair

Xiaoming Liu

Committee Member

Yunxin Fu

Committee Member

Peng Wei

Committee Member

Degui Zhi

Committee Member

Myriam Fornage

Abstract

IMPROVING dbNSFP

Mingyao Lu, B.S.

Advisory Professor: Xiaoming Liu, Ph.D.

The analysis and interpretation of DNA variation are central to whole exome sequencing (WES) studies. Genome research has focused largely on single nucleotide variants (SNVs). However, indels are as important as SNVs, and indels in coding regions in particular are often candidate disease-causing variants, so it is necessary to expand the focus to include indel mutations.

The goal of my project is to provide an automatic annotation pipeline for WES-based disease studies by extending dbNSFP with a tool for automated indel annotation and deleteriousness prediction. Current sequencing results typically include both SNVs and indels. Although many tools are available that integrate functional predictions and annotations for SNV effects, to my knowledge there are no such tools for indels. Therefore, the aim of this thesis was to add deleteriousness prediction scores, including CADD, SIFT, and PROVEAN, to gene-model-based indel annotation. All of these scores can be calculated on the fly once the required resources are installed locally. A Docker image implementing the indel annotation and deleteriousness prediction has been developed and is ready to be deployed from the cloud.

Keywords

Indel Annotation, Functional Annotation, Whole Exome Sequencing, Deleteriousness Prediction Scores

Available for download on Monday, June 10, 2019
