Author ORCID Identifier

0000-0002-5124-9338

Date of Graduation

5-2026

Document Type

Dissertation (PhD)

Program Affiliation

Quantitative Sciences

Degree Name

Doctor of Philosophy (PhD)

Advisor/Committee Chair

Zhongming Zhao

Committee Member

Peng Wei

Committee Member

Eva Sevick-Muraca

Committee Member

Claudio Soto

Committee Member

Wenbo Li

Abstract

The functional plasticity of the immune system is central to human health and disease, yet the high-resolution assessment of immune cell states remains a significant challenge. The advent of single-cell sequencing has revolutionized this research field by enabling the profiling of individual cellular transcriptomes. However, the resulting high-dimensional datasets are frequently characterized by extreme sparsity, stochastic noise, and technical batch effects that can obscure underlying biological signals. Navigating this complexity necessitates the development of sophisticated computational frameworks capable of modeling single-cell data. This dissertation presents comprehensive bioinformatics frameworks for the characterization, representation, and prediction of immune cell states by integrating single-cell sequencing with advanced machine learning architectures.

The first pillar, characterization, is demonstrated by Scupa, which leverages single-cell foundation models for a unified assessment of cytokine-driven polarization. Scupa demonstrates that immune polarization exists on a conserved functional spectrum across diverse physiological and pathological contexts. The second pillar, representation, is addressed through FADVI, a variational autoencoder framework utilizing factorized disentanglement to isolate biological signals from technical batch effects. This provides a robust foundation for data integration. The translational utility of these methods is demonstrated in a study of regional immunotherapy delivery. By characterizing the immune landscape in response to vaccination and αCTLA4 delivery to non-tumor-draining lymph nodes, we identified specific T cell subpopulations as primary mediators of anti-tumor efficacy, highlighting the necessity of state-specific resolution. The third pillar, prediction, is exemplified by Turep, a deep-learning framework for cross-cancer tumor-reactive T cell identification. Turep identifies tumor-reactive clones in an antigen-agnostic manner and reveals active tumor-recognition regions within the tissue architecture when extended to spatial transcriptomics.

Collectively, these frameworks mark a pivotal transition from descriptive analysis toward inferential systems biology. They establish benchmarks for disentangled representation and functional scoring using generative modeling and foundation models. For the broader biology community, these methodologies provide high-fidelity tools for navigating immune plasticity, enabling researchers to move beyond static cell-type identities to identify the specific, fluid functional states that drive disease progression and therapeutic response.

Keywords

Machine learning, single-cell sequencing, immune cell state, cancer immunology

Available for download on Saturday, April 17, 2027
