
Faculty, Staff and Student Publications
Publication Date
2-14-2025
Journal
Genome Research
Abstract
One of the major challenges in genomic data sharing is protecting participants' privacy in collaborative studies and in cases where genomic data are outsourced for analysis, for example, to genotype imputation services or in federated collaborative genomic analyses. Although numerous cryptographic methods have been developed, these methods may not yet be practical for population-scale tasks: they can be computationally demanding, rely on high-level expertise in security, and require each algorithm to be implemented from scratch. In this study, we focus on outsourcing genotype imputation, a fundamental task that utilizes population-level reference panels, and develop protocols that use "proxy panels" to protect genotype panels while the imputation task is outsourced to servers. The proxy panels are generated through a series of protection mechanisms, such as haplotype sampling, allele hashing, and coordinate anonymization, that protect the underlying sensitive panel's genetic variant coordinates, genetic maps, and chromosome-wide haplotypes. Although the resulting proxy panels are almost distinct from the sensitive panels, they are valid panels that can be used as input to imputation methods such as Beagle. We demonstrate that proxy-based imputation protects against well-known attacks with a minor decrease in imputation accuracy for variants across a wide range of allele frequencies.
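The three protection mechanisms named in the abstract can be illustrated with a toy sketch. This is not the authors' protocol: the abstract gives no algorithmic detail, so every function name, the per-variant keyed bit flip standing in for "allele hashing", the rank-based coordinates, and the mosaic resampling with a fixed `switch_prob` are assumptions chosen only to show the general idea of turning a sensitive panel into a different but still well-formed panel.

```python
import hashlib
import random


def anonymize_coordinates(positions):
    """Replace real base-pair positions with rank indices (variant order is kept)."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    ranks = [0] * len(positions)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def variant_flip_bits(n_variants, key):
    """Derive one secret bit per variant from a key (toy keyed hash, an assumption)."""
    return [
        hashlib.sha256(f"{key}:{j}".encode()).digest()[0] & 1
        for j in range(n_variants)
    ]


def hash_alleles(haplotypes, key):
    """Recode 0/1 alleles by XOR with the per-variant secret bits."""
    bits = variant_flip_bits(len(haplotypes[0]), key)
    return [[a ^ b for a, b in zip(hap, bits)] for hap in haplotypes]


def sample_haplotypes(haplotypes, n_out, switch_prob, rng):
    """Build mosaic haplotypes: copy from one source haplotype, switching at random."""
    n_variants = len(haplotypes[0])
    out = []
    for _ in range(n_out):
        src = rng.randrange(len(haplotypes))
        mosaic = []
        for j in range(n_variants):
            if rng.random() < switch_prob:
                src = rng.randrange(len(haplotypes))
            mosaic.append(haplotypes[src][j])
        out.append(mosaic)
    return out


def make_proxy_panel(positions, haplotypes, key, n_out, switch_prob=0.1, seed=0):
    """Chain the three toy steps: sampling, allele recoding, coordinate anonymization."""
    rng = random.Random(seed)
    sampled = sample_haplotypes(haplotypes, n_out, switch_prob, rng)
    return anonymize_coordinates(positions), hash_alleles(sampled, key)


# Hypothetical 3-haplotype, 4-variant panel (made-up values, not real data).
positions = [10100, 10450, 10900, 11320]
panel = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
proxy_positions, proxy_panel = make_proxy_panel(positions, panel, key="secret", n_out=5)
```

In this sketch the key holder can undo the allele recoding by applying `hash_alleles` again with the same key, since XOR with the same bits is its own inverse. Whether the published protocol has a comparable decoding step is not stated in the abstract.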
Keywords
Humans, Outsourced Services, Genotype, Algorithms, Genetic Privacy, Haplotypes, Genomics, Polymorphism, Single Nucleotide, Information Dissemination
DOI
10.1101/gr.278934.124
PMID
39794122
PMCID
PMC11874966
PubMedCentral® Posted Date
2-1-2025
PubMedCentral® Full Text Version
Post-print
Published Open-Access
yes
Included in
Bioinformatics Commons, Biomedical Informatics Commons, Data Science Commons, Genomics Commons, Medical Genetics Commons