Faculty, Staff and Student Publications
Language
English
Publication Date
8-31-2025
Journal
Briefings in Bioinformatics
DOI
10.1093/bib/bbaf565
PMID
41148224
PMCID
PMC12560794
PubMedCentral® Posted Date
10-28-2025
PubMedCentral® Full Text Version
Post-print
Abstract
Polygenic risk scores (PRS) are widely used to assess genetic susceptibility in Alzheimer's disease (AD) research. However, the rapid expansion of PRS studies has led to dataset-specific biases, stemming from factors such as population makeup, genotyping methods, and analysis pipelines, that result in inconsistent variant prioritization and limit generalizability and reproducibility. To address these challenges, we propose a transductive learning framework that integrates multiple PRS datasets for more robust risk variant prioritization, incorporating genome-wide association study (GWAS) priority scores as biologically informed priors. Additionally, we introduce BrainGeneBot, an AI-driven tool that leverages generative pretrained transformers with retrieval-augmented generation to streamline genomic analyses in AD; it integrates STRING for protein interaction analysis, Enrichr for gene set enrichment, ClinVar for genetic variant interpretation, and Biopython for literature searches. We apply our approach to publicly available AD datasets from the PGS Catalog and conduct further analyses to validate its efficacy. In parallel, we perform conventional unsupervised rank aggregation as a baseline. The transductive learning approach not only confirms high-risk variants identified by traditional methods but also reveals unique insights that correlate better with GWAS signals. Our framework streamlines data retrieval and interpretation, effectively prioritizing genetic variants across multiple PRS studies. Moreover, BrainGeneBot facilitates the discovery of biologically meaningful insights to enhance PRS interpretability and applicability in AD research, supporting the development of precise AD interventions and treatments. Our approach provides a robust framework for AD genetic research, improving data accessibility, accelerating discoveries, and refining genetic insights.
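For readers unfamiliar with the baseline named in the abstract, the sketch below illustrates one common form of unsupervised rank aggregation, a mean Borda count over several ranked variant lists. The abstract does not say which aggregation method the authors used, so this is an assumption chosen for illustration. The function name `borda_aggregate`, the variant identifiers, and the three example rankings are all hypothetical and do not come from the paper or the PGS Catalog.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine ranked lists of variant IDs using a mean Borda score.

    Each input list is ordered from highest to lowest priority.
    Assumption: a variant missing from a list scores 0 for that list.
    """
    universe = {variant for ranking in rankings for variant in ranking}
    n = len(universe)
    scores = defaultdict(float)
    for ranking in rankings:
        for position, variant in enumerate(ranking):
            # The top-ranked variant earns n points, the next n - 1, and so on.
            scores[variant] += n - position
    k = len(rankings)
    # Sort by descending mean score; break ties by ID so output is deterministic.
    return sorted(universe, key=lambda v: (-scores[v] / k, v))


# Hypothetical rankings from three PRS studies (placeholder IDs, not real data).
study_a = ["variant_1", "variant_2", "variant_3"]
study_b = ["variant_1", "variant_3", "variant_4"]
study_c = ["variant_2", "variant_1", "variant_4"]

consensus = borda_aggregate([study_a, study_b, study_c])
print(consensus)
```

Unlike the transductive framework described in the abstract, a baseline of this kind uses no GWAS-derived prior: every study contributes equally, and a variant's consensus position depends only on its ranks.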
Keywords
Humans, Multifactorial Inheritance, Genome-Wide Association Study, Alzheimer Disease, Genetic Predisposition to Disease, Software, Polymorphism, Single Nucleotide, Genetic Risk Score, genomic analysis, GPT-powered informatics, polygenic score, rank aggregation, transductive learning
Published Open-Access
yes
Recommended Citation
Qu, Gang; Enduru, Nitesh; Liu, Xinyi; et al., "BrainGeneBot: A Framework for Variant Prioritization and Generative Pretrained Transformer-Informed Interpretation Across Polygenic Risk Score Studies" (2025). Faculty, Staff and Student Publications. 731.
https://digitalcommons.library.tmc.edu/uthshis_docs/731