Publication Date
10-1-2024
Journal
Nature Biotechnology
DOI
10.1038/s41587-023-02024-y
PMID
38168980
PMCID
PMC11217151
PubMedCentral® Posted Date
1-2-2024
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Keywords
Humans, Mosaicism, High-Throughput Nucleotide Sequencing, Genomic Structural Variation, Software, Sequence Analysis, DNA, Genome informatics, Genetics, Cancer
Abstract
Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing a repeat-aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.
Included in
Biological Phenomena, Cell Phenomena, and Immunity Commons, Biomedical Informatics Commons, Genetics and Genomics Commons, Medical Genetics Commons, Medical Molecular Biology Commons, Medical Specialties Commons
Comments
This article has been corrected. See Nat Biotechnol. 2024 Jan 22;42(10):1616.
Associated Data