Faculty, Staff and Student Publications

Publication Date

10-1-2022

Journal

Machine Learning

DOI

10.1007/s10994-022-06174-z

PMID

40766896

PMCID

PMC12323809

PubMedCentral® Posted Date

8-5-2025

PubMedCentral® Full Text Version

Author MSS

Abstract

Although there has been an explosive rise in network data in a variety of disciplines, there is very limited development of regression modeling approaches based on high-dimensional networks. The scarce literature in this area typically assumes linear relationships between the outcome and the high-dimensional network edges, which results in an inflated model plagued by the curse of dimensionality; these models are also unable to accommodate non-linear relationships or higher-order interactions. To overcome these limitations, we develop a novel two-stage Bayesian non-parametric regression modeling framework using high-dimensional networks as covariates, which first finds a lower-dimensional node-specific representation for the networks and then embeds these representations in a flexible Gaussian process regression framework, along with supplemental covariates, for modeling the continuous outcome variable. Moving from edge-level analysis to node-level modeling allows us to scale up to high-dimensional networks and enables node selection via an extension of the Gaussian process framework that involves spike-and-slab priors on the lengthscale parameters. Extensive simulations show a distinct advantage of the proposed approach in terms of prediction, coverage, and node selection. The proposed model achieves considerable gains when predicting posttraumatic stress disorder (PTSD) resilience based on brain networks in our motivating neuroimaging applications, and also identifies important brain regions associated with PTSD. In contrast, existing non-linear approaches that employ the full edge set or that use other dimension reduction techniques on the network are not equipped for node selection and result in poor prediction and poor characterization of predictive uncertainty, while linear approaches using the edge-level features are overly inflated and typically result in poor performance.
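As a rough illustration of the two-stage idea described in the abstract, the following Python sketch reduces each network to one feature per node and then fits a Gaussian process with a separate lengthscale per node. It is not the authors' method: node strength stands in for the latent node-specific representation, the kernel is parameterized by inverse lengthscales for convenience, and those values are fixed by hand rather than given spike-and-slab priors and inferred. All function names, the simulated data, and the noise level are assumptions made for this example.

```python
import numpy as np

def node_features(networks):
    """Stage 1 stand-in (assumption): node strength, one feature per node.
    networks has shape (n_subjects, V, V) and holds symmetric edge weights."""
    return np.abs(networks).sum(axis=2)

def ard_kernel(X1, X2, inv_lengthscales, variance=1.0):
    """Squared-exponential kernel with one inverse lengthscale per node.
    An inverse lengthscale of 0 removes that node from the kernel."""
    Z1 = X1 * inv_lengthscales
    Z2 = X2 * inv_lengthscales
    sq_dist = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * sq_dist)

def gp_predict(X_train, y_train, X_test, inv_lengthscales, noise=0.1):
    """Stage 2: GP posterior mean and variance of the latent function."""
    K = ard_kernel(X_train, X_train, inv_lengthscales)
    K += noise * np.eye(len(X_train))
    K_star = ard_kernel(X_test, X_train, inv_lengthscales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    prior_var = ard_kernel(X_test, X_test, inv_lengthscales).diagonal()
    var = prior_var - (v ** 2).sum(axis=0)
    return mean, var

# Simulated data (hypothetical): 60 subjects, 5-node symmetric networks.
rng = np.random.default_rng(0)
A = rng.normal(size=(60, 5, 5))
A = (A + A.transpose(0, 2, 1)) / 2

X = node_features(A)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each node feature

# The outcome depends non-linearly on node 0 only.
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)

# Hand-fixed selection: node 0 is active, the other nodes are switched off.
inv_ls = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
mean, var = gp_predict(X[:50], y[:50], X[50:], inv_ls)
```

Because the kernel multiplies each node feature by its inverse lengthscale, setting that value to zero makes the fitted function ignore the node entirely, which is the intuition behind selecting nodes through the lengthscale parameters.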

Keywords

Dimension reduction, Gaussian process regression, Latent scale network models, Manifold, Posttraumatic stress disorder

Published Open-Access

yes
