Publication Date
9-11-2023
Journal
Genome Biology
DOI
10.1186/s13059-023-03022-8
PMID
37697406
PMCID
PMC10496407
PubMedCentral® Posted Date
9-11-2023
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Keywords
Humans, Animals, Hominidae, Segmental Duplications, Genomic, Telomere, Genomics, Chromosomes, Human, De-novo assembly, Segmental duplications, Long-read PacBio sequencing, Chromosomal fusion, Complex genomic rearrangements
Abstract
Resolving complex genomic regions rich in segmental duplications (SDs) is challenging due to the high error rate of long-read sequencing. Here, we describe a targeted approach with a novel genome assembler, PhaseDancer, that iteratively extends SD-rich regions of interest. We validate its robustness and efficiency using a gold-standard set of human BAC clones and in silico-generated SDs with predefined evolutionary scenarios. PhaseDancer enables extension of the incomplete complex SD-rich subtelomeric regions of Great Ape chromosomes orthologous to the human chromosome 2 (HSA2) fusion site, informing a model of HSA2 formation and unravelling the evolution of human and Great Ape genomes.
Included in
Biological Phenomena, Cell Phenomena, and Immunity Commons, Biomedical Informatics Commons, Genetics and Genomics Commons, Medical Genetics Commons, Medical Molecular Biology Commons, Medical Specialties Commons