Language
English
Publication Date
2-6-2025
Journal
American Journal of Human Genetics
DOI
10.1016/j.ajhg.2024.12.005
PMID
39753130
PMCID
PMC11866949
PubMedCentral® Posted Date
1-2-2025
PubMedCentral® Full Text Version
Post-print
Abstract
In recent years, significant efforts have been made to improve methods for genomic studies of admixed populations using local ancestry inference (LAI). Accurate LAI is crucial to ensure that downstream analyses accurately reflect the genetic ancestry of research participants. Here, we test analytic strategies for LAI to provide guidelines for optimal accuracy, focusing on admixed populations reflective of Latin America's primary continental ancestries: African (AFR), Amerindigenous (AMR), and European (EUR). Simulating linkage-disequilibrium-informed admixed haplotypes under a variety of 2- and 3-way admixture models, we implemented a standard LAI pipeline, testing the impact of reference panel composition, DNA data type, demography, and software parameters to quantify ancestry-specific LAI accuracy. We observe that across all models, AMR tracts have notably reduced LAI accuracy as compared to EUR and AFR tracts, with true positive rate means for AMR ranging from 88% to 94%, EUR from 96% to 99%, and AFR from 98% to 99%. When LAI miscalls occurred, they most frequently erroneously called EUR ancestry in true AMR sites. Concerning reference panel curation, we find that using a reference panel well matched to the target population, even with a smaller sample size, was accurate and the most computationally efficient. Imputation did not harm LAI performance in our tests; rather, we observed that higher variant density improved accuracy. While directly responsive to admixed Latin American cohort compositions, these trends are broadly useful for informing best practices for LAI across admixed populations. Our findings reinforce the need for the inclusion of more underrepresented populations in sequencing efforts to improve reference panels.
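As an illustration of the ancestry-specific true positive rate reported in the abstract, the following is a minimal sketch and not the authors' pipeline. It assumes that simulated (true) and inferred ancestry labels are available per site as equal-length sequences. The function names and the toy data are hypothetical.

```python
from collections import Counter

ANCESTRIES = ("AFR", "AMR", "EUR")


def confusion_counts(true_calls, inferred_calls):
    """Count (true ancestry, inferred ancestry) pairs across sites."""
    if len(true_calls) != len(inferred_calls):
        raise ValueError("true and inferred calls must have the same length")
    return Counter(zip(true_calls, inferred_calls))


def true_positive_rate(counts, ancestry):
    """Fraction of sites truly of `ancestry` that were also called `ancestry`."""
    total = sum(n for (truth, _), n in counts.items() if truth == ancestry)
    if total == 0:
        return None  # ancestry absent from the truth set
    return counts[(ancestry, ancestry)] / total


def most_common_miscall(counts, ancestry):
    """Ancestry most often called in error at sites truly of `ancestry`."""
    errors = {
        called: n
        for (truth, called), n in counts.items()
        if truth == ancestry and called != ancestry
    }
    return max(errors, key=errors.get) if errors else None


# Toy data (hypothetical): eight sites along one simulated haplotype.
truth = ["AFR", "AFR", "AMR", "AMR", "AMR", "AMR", "EUR", "EUR"]
calls = ["AFR", "AFR", "AMR", "EUR", "AMR", "AMR", "EUR", "EUR"]

counts = confusion_counts(truth, calls)
for ancestry in ANCESTRIES:
    print(ancestry, true_positive_rate(counts, ancestry),
          most_common_miscall(counts, ancestry))
```

In this toy example one true AMR site is miscalled as EUR, so the AMR true positive rate is lower than the AFR and EUR rates. This is the same kind of per-ancestry comparison the abstract summarizes, though the numbers here are invented for illustration only.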
Keywords
Humans, Black People, Genetics, Population, Genome, Human, Haplotypes, Linkage Disequilibrium, Models, Genetic, Polymorphism, Single Nucleotide, Software, White People, Latin America, American Indian or Alaska Native, South American People, Central American People, ancestry, local ancestry inference, population genetics, genetic admixture, bioinformatics, reference panels
Published Open-Access
yes
Recommended Citation
Honorato-Mauer, Jessica; Shah, Nirav N; Maihofer, Adam X; et al., "Characterizing Features Affecting Local Ancestry Inference Performance in Admixed Populations" (2025). Faculty and Staff Publications. 5016.
https://digitalcommons.library.tmc.edu/baylor_docs/5016
Included in
Genetic Phenomena Commons, Genetic Processes Commons, Genetic Structures Commons, Medical Genetics Commons, Medical Molecular Biology Commons, Medical Specialties Commons