Faculty, Staff and Student Publications

Language

English

Publication Date

8-31-2025

Journal

Briefings in Bioinformatics

DOI

10.1093/bib/bbaf565

PMID

41148224

PMCID

PMC12560794

PubMedCentral® Posted Date

10-28-2025

PubMedCentral® Full Text Version

Post-print

Abstract

Polygenic risk scores (PRS) are widely used to assess genetic susceptibility in Alzheimer's disease (AD) research. However, the rapid expansion of PRS studies has led to dataset-specific biases, stemming from factors such as population makeup, genotyping methods, and analysis pipelines, that result in inconsistent variant prioritization and limit generalizability and reproducibility. To address these challenges, we propose a transductive learning framework that integrates multiple PRS datasets for more robust risk variant prioritization, incorporating genome-wide association study (GWAS) priority scores as biologically informed priors. Additionally, we introduce BrainGeneBot, an AI-driven tool that leverages generative pretrained transformers with retrieval-augmented generation to streamline genomic analyses in AD. It integrates STRING for protein interaction analysis, Enrichr for gene set enrichment, ClinVar for genetic variant interpretation, and Biopython for literature searches. We apply our approach to publicly available AD datasets from the PGS Catalog and conduct further analyses to validate its efficacy. In parallel, we perform conventional unsupervised rank aggregation as a baseline. The transductive learning approach not only verifies high-risk variants identified by traditional methods but also reveals unique insights that better correlate with GWAS signals. Our framework streamlines data retrieval and interpretation, effectively prioritizing genetic variants across multiple PRS studies. Moreover, BrainGeneBot facilitates the discovery of biologically meaningful insights that enhance PRS interpretability and applicability in AD research, supporting the development of precise AD interventions and treatments. Our approach provides a robust framework for AD genetic research, improving data accessibility, accelerating discoveries, and refining genetic insights.
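
The abstract does not say which rank aggregation method serves as the unsupervised baseline. As an illustration only, the sketch below uses a simple mean-rank (Borda-style) rule to combine variant rankings from several PRS studies. The function name `aggregate_ranks`, the placeholder variant identifiers, and the choice to rank a variant absent from a study just below that study's last entry are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def aggregate_ranks(rankings):
    """Combine ranked variant lists by mean rank (a Borda-style rule).

    rankings: list of lists, each ordered from highest to lowest priority.
    A variant absent from a list is given rank len(list) + 1 (an assumption).
    Returns all variants sorted by ascending mean rank, ties broken by name.
    """
    variants = set().union(*rankings)
    total = defaultdict(float)
    for ranking in rankings:
        position = {v: i + 1 for i, v in enumerate(ranking)}
        missing_rank = len(ranking) + 1
        for v in variants:
            total[v] += position.get(v, missing_rank)
    n = len(rankings)
    return sorted(variants, key=lambda v: (total[v] / n, v))


# Hypothetical rankings from three PRS studies (placeholder identifiers).
study_1 = ["variant_A", "variant_B", "variant_C"]
study_2 = ["variant_B", "variant_A", "variant_D"]
study_3 = ["variant_B", "variant_C", "variant_A"]

consensus = aggregate_ranks([study_1, study_2, study_3])
print(consensus)
```

A rule of this kind treats every study equally and uses no external signal. The transductive framework described in the abstract differs in that it additionally incorporates GWAS priority scores as priors.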

Keywords

Humans, Multifactorial Inheritance, Genome-Wide Association Study, Alzheimer Disease, Genetic Predisposition to Disease, Software, Polymorphism, Single Nucleotide, Genetic Risk Score, genomic analysis, GPT-powered informatics, polygenic score, rank aggregation, transductive learning

Published Open-Access

yes
