Publication Date

9-1-2023

Journal

Nature

DOI

10.1038/s41586-023-06457-y

PMID

37612512

PMCID

PMC10752217

PubMedCentral® Posted Date

3-1-2024

PubMedCentral® Full Text Version

Post-print

Published Open-Access

yes

Keywords

Humans; Base Sequence; Chromosomes, Human, Y; DNA, Satellite; Genetic Variation; Genetics, Population; Genomics; Heterochromatin; Multigene Family; Reference Standards; Segmental Duplications, Genomic; Sequence Analysis, DNA; Tandem Repeat Sequences; Telomere

Abstract

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures of TSPY, DAZ, and RBMY gene families; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome [4] and mapped available population variation, clinical variants, and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
