Faculty, Staff and Student Publications

Evaluation of Vicinity-Based Hidden Markov Models For Genotype Imputation

Language

English

Publication Date

8-29-2022

Journal

BMC Bioinformatics

DOI

10.1186/s12859-022-04896-4

PMID

36038834

PMCID

PMC9422108

PubMedCentral® Posted Date

August 2022

PubMedCentral® Full Text Version

Post-print

Abstract

BACKGROUND: The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with array-based genotyping is a cost-effective and accurate alternative for genotyping common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel.

RESULTS: Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around it (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants.

CONCLUSIONS: Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods, and vicinity-based HMMs can be used for restructuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer.
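
To make the idea of window-restricted imputation concrete, the following is a minimal sketch of a haploid Li-Stephens forward-backward pass that uses only the typed variants within a fixed genetic distance of one untyped variant. It is an illustration under stated assumptions, not the LoHaMMer implementation: the function name `impute_in_window`, the parameters `switch_rate` and `error`, the exponential switch model, and the use of the nearest typed variant's state posterior are all hypothetical choices made for brevity.

```python
import math


def impute_in_window(ref_haps, target, positions_cm, untyped_idx,
                     window_cm=1.0, switch_rate=0.1, error=1e-3):
    """Return the posterior alt-allele probability of one untyped variant.

    ref_haps:     reference haplotypes, each a list of 0/1 alleles.
    target:       0/1 at typed variants, None at untyped variants.
    positions_cm: genetic position of every variant, in centimorgans.
    switch_rate, error: hypothetical HMM parameters (assumptions).
    """
    center = positions_cm[untyped_idx]
    # Keep only the typed variants that fall inside the window.
    typed = [i for i, allele in enumerate(target)
             if allele is not None and abs(positions_cm[i] - center) <= window_cm]
    if not typed:
        raise ValueError("no typed variants inside the window")
    n_states = len(ref_haps)  # one hidden state per reference haplotype

    def emission(k, i):
        # Probability of the observed allele given the copied haplotype.
        return 1.0 - error if ref_haps[k][i] == target[i] else error

    def switch_prob(i_prev, i_next):
        # Assumed model: switching grows with genetic distance.
        dist = positions_cm[i_next] - positions_cm[i_prev]
        return 1.0 - math.exp(-switch_rate * dist)

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass; each column is rescaled to sum to 1 for stability.
    fwd = [normalize([emission(k, typed[0]) / n_states for k in range(n_states)])]
    for t in range(1, len(typed)):
        tau = switch_prob(typed[t - 1], typed[t])
        prev = fwd[-1]
        fwd.append(normalize([
            ((1.0 - tau) * prev[k] + tau / n_states) * emission(k, typed[t])
            for k in range(n_states)
        ]))

    # Backward pass, rescaled the same way.
    bwd = [None] * len(typed)
    bwd[-1] = [1.0 / n_states] * n_states
    for t in range(len(typed) - 2, -1, -1):
        tau = switch_prob(typed[t], typed[t + 1])
        weighted = [bwd[t + 1][k] * emission(k, typed[t + 1]) for k in range(n_states)]
        total = sum(weighted)
        bwd[t] = normalize([
            (1.0 - tau) * weighted[k] + tau * total / n_states
            for k in range(n_states)
        ])

    # Simplification: reuse the state posterior at the nearest typed variant.
    nearest = min(range(len(typed)),
                  key=lambda t: abs(positions_cm[typed[t]] - center))
    posterior = normalize([fwd[nearest][k] * bwd[nearest][k] for k in range(n_states)])
    return sum(posterior[k] * ref_haps[k][untyped_idx] for k in range(n_states))


if __name__ == "__main__":
    # Toy data: two reference haplotypes, five variants, the middle one untyped.
    panel = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
    observed = [1, 1, None, 1, 1]
    cm = [0.0, 0.2, 0.4, 0.6, 0.8]
    print(impute_in_window(panel, observed, cm, untyped_idx=2))
```

Because each untyped variant depends only on its own window, calls like the one above are independent of each other and could in principle be distributed across workers, which is the kind of parallelization the conclusions point to.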

Keywords

Genotype imputation, Hidden Markov models, Forward–Backward algorithm, Viterbi algorithm

Published Open-Access

yes

Recommended Citation

Wang, Su; Kim, Miran; Jiang, Xiaoqian; et al., "Evaluation of Vicinity-Based Hidden Markov Models For Genotype Imputation" (2022). Faculty, Staff and Student Publications. 199.
https://digitalcommons.library.tmc.edu/uthshis_docs/199

Included in

Bioinformatics Commons, Biomedical Informatics Commons, Data Science Commons, Genomics Commons, Medical Genetics Commons
