Publication Date
1-8-2025
Journal
Nature Communications
DOI
10.1038/s41467-024-55710-z
PMID
39779690
PMCID
PMC11711550
PubMedCentral® Posted Date
1-8-2025
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Keywords
Humans; Chromosomes, Human, Y; Chromosomes, Human, X; Male; Benchmarking; DNA Copy Number Variations; Genome, Human; Genetic Variation; Genomics; Genome informatics; DNA sequencing
Abstract
The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers along with research and clinical laboratories to evaluate variant detection on male sex chromosomes X and Y, we create a small variant benchmark set with 111,725 variants for the Genome in a Bottle HG002 reference material. We develop an active evaluation approach to demonstrate the benchmark set reliably identifies errors in challenging genomic regions and across short and long read callsets. We show how complete assemblies can expand benchmarks to difficult regions, but highlight remaining challenges benchmarking variants in long homopolymers and tandem repeats, complex gene conversions, copy number variable gene arrays, and human satellites.
Included in
Biological Phenomena, Cell Phenomena, and Immunity Commons; Biomedical Informatics Commons; Genetics and Genomics Commons; Medical Genetics Commons; Medical Molecular Biology Commons; Medical Specialties Commons