Publication Date
1-2-2025
Journal
Nature Communications
DOI
10.1038/s41467-024-55066-4
PMID
39746940
PMCID
PMC11696468
PubMedCentral® Posted Date
1-2-2025
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Keywords
Humans; Mutation, Missense; Computational Biology; Gene Frequency; ROC Curve; Polymorphism, Single Nucleotide; Databases, Genetic; Software
Abstract
Computational methods for estimating missense variant impact suffer from inconsistent performance across genes, which poses a major challenge for their reliable use in clinical practice. While ensemble scores leverage multiple prediction methods to enhance consistency, the overrepresentation of certain genes in the training data can bias their outcomes. To address this critical limitation, we propose a gene-specific ensemble framework trained on reference computational annotations rather than on clinical or experimental data. Accordingly, we generate Meta-EA ensemble scores that achieve performance comparable to that of the top individual prediction method for each gene set. Incorporating the effects of splicing and the allele frequency of human polymorphisms further enhances the performance of Meta-EA, achieving an area under the receiver operating characteristic curve of 0.97 for both gene-balanced and gene-imbalanced clinical assessments. In conclusion, this work leverages the wealth of existing variant impact prediction approaches to generate improved estimations for clinical interpretation.
Included in
Biological Phenomena, Cell Phenomena, and Immunity Commons, Biomedical Informatics Commons, Genetics and Genomics Commons, Medical Genetics Commons, Medical Molecular Biology Commons, Medical Specialties Commons
Comments
Associated Data