Date of Graduation

5-2013

Document Type

Dissertation (PhD)

Program Affiliation

Biomathematics and Biostatistics

Degree Name

Doctor of Philosophy (PhD)

Advisor/Committee Chair

Shoudan Liang

Committee Member

Peter Mueller

Committee Member

Yuan Ji

Committee Member

Qing Ma

Committee Member

Marcos R. Estecio

Abstract

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping, and variant detection, is far from mature, and the high sequencing error rate is one of the major problems.

To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, indicating that MTM’s performance does not depend on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM performed better than Quake and comparably to Hammer. The improved error correction from MTM in turn improved the quality of downstream analyses, such as mapping and SNP detection.

SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (

Keywords

next-generation sequencing, sequencing error, error correction, SNP detection
