
Faculty, Staff and Student Publications
Publication Date
3-4-2022
Journal
Bioinformatics
Abstract
Motivation: RNA-sequencing (RNA-seq) of tumor tissue is typically only used to measure gene expression. Here, we present a statistical approach that leverages existing RNA-seq data to also detect somatic copy number alterations (SCNAs), a pervasive phenomenon in human cancers, without a need to sequence the corresponding DNA.
Results: We present an analysis of 4942 participant samples from 28 cancers in The Cancer Genome Atlas (TCGA), demonstrating robust detection of SCNAs from RNA-seq. Using genotype imputation and haplotype information, our RNA-based method achieved a median sensitivity of 85% for detecting SCNAs defined by DNA analysis, at high specificity (∼95%). As an example of translational potential, we successfully replicated SCNA features associated with breast cancer subtypes. Our results validate haplotype-based inference from RNA-seq for detecting SCNAs in clinical and population-based settings.
Availability and implementation: The analyses presented use data publicly available from the TCGA Research Network (http://cancergenome.nih.gov/). See Methods for details regarding data downloads. The hapLOHseq software is freely available under the MIT license and can be downloaded from http://scheet.org/software.html.
Supplementary information: Supplementary data are available at Bioinformatics online.
Keywords
Humans; Female; Software; Breast Neoplasms; Genome; Exome Sequencing; RNA; Sequence Analysis, RNA
DOI
10.1093/bioinformatics/btab861
PMID
34999743
PMCID
PMC8896613
PubMedCentral® Posted Date
1-6-2022
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Included in
Bioinformatics Commons, Biomedical Informatics Commons, Genetic Phenomena Commons, Medical Genetics Commons, Oncology Commons