Author ORCID Identifier
0000-0002-5124-9338
Date of Graduation
5-2026
Document Type
Dissertation (PhD)
Program Affiliation
Quantitative Sciences
Degree Name
Doctor of Philosophy (PhD)
Advisor/Committee Chair
Zhongming Zhao
Committee Member
Peng Wei
Committee Member
Eva Sevick-Muraca
Committee Member
Claudio Soto
Committee Member
Wenbo Li
Abstract
The functional plasticity of the immune system is central to human health and disease, yet the high-resolution assessment of immune cell states remains a significant challenge. The advent of single-cell sequencing has revolutionized this research field by enabling the profiling of individual cellular transcriptomes. However, the resulting high-dimensional datasets are frequently characterized by extreme sparsity, stochastic noise, and technical batch effects that can obscure underlying biological signals. Navigating this complexity necessitates the development of sophisticated computational frameworks capable of modeling single-cell data. This dissertation presents comprehensive bioinformatics frameworks for the characterization, representation, and prediction of immune cell states by integrating single-cell sequencing with advanced machine learning architectures.
The first pillar, characterization, is demonstrated by Scupa, which leverages single-cell foundation models for a unified assessment of cytokine-driven polarization. Scupa shows that immune polarization exists on a conserved functional spectrum across diverse physiological and pathological contexts. The second pillar, representation, is addressed through FADVI, a variational autoencoder framework utilizing factorized disentanglement to isolate biological signals from technical batch effects. This separation provides a robust foundation for data integration. The translational utility of these methods is illustrated in a study of regional immunotherapy delivery. By characterizing the immune landscape in response to vaccination and αCTLA4 delivery to non-tumor-draining lymph nodes, we identified specific T cell subpopulations as primary mediators of anti-tumor efficacy, highlighting the necessity of state-specific resolution. The third pillar, prediction, is exemplified by Turep, a deep-learning framework for cross-cancer tumor-reactive T cell identification. Turep identifies tumor-reactive clones in an antigen-agnostic manner and, when extended to spatial transcriptomics, reveals active tumor-recognition regions within the tissue architecture.
Collectively, these frameworks mark a pivotal transition from descriptive analysis toward inferential systems biology. They establish benchmarks for disentangled representation and functional scoring using generative modeling and foundation models. For the broader biology community, these methodologies provide high-fidelity tools for navigating immune plasticity, enabling researchers to move beyond static cell-type identities to identify the specific, fluid functional states that drive disease progression and therapeutic response.
Recommended Citation
Liu, Wendao, "Characterization, representation, and prediction of immune cell states using single-cell sequencing and machine learning" (2026). Dissertations & Theses (Open Access). 1512.
https://digitalcommons.library.tmc.edu/utgsbs_dissertations/1512
Keywords
Machine learning, single-cell sequencing, immune cell state, cancer immunology
Included in
Bioinformatics Commons, Cancer Biology Commons, Cell Biology Commons, Computational Biology Commons, Immunity Commons, Systems Biology Commons