Influence Of Highly Correlated Cross-Basis Functions In The Distributed Lag Non-Linear Model

Yunqi Liao, The University of Texas School of Public Health

Abstract

The distributed lag non-linear model (DLNM) is frequently used in environmental and epidemiological time-series studies to explore delayed effects on mortality and morbidity. The DLNM represents delayed non-linear exposure-outcome relationships through cross-basis functions. Although investigators have used multiple cross-basis functions in a single DLNM, the correlation between those cross-basis functions has not been evaluated and the effect of using multiple cross-basis functions in a DLNM is unknown. This study determined whether two highly correlated variables would produce two highly correlated cross-basis functions, and examined the effect of using two highly correlated cross-basis functions on DLNM predictions through a simulation analysis. A simulation analysis of 500 data sets, each with 1,000 observations of X1, X2, and Y, was conducted. RV-coefficients between the cross-basis matrices of X1 and X2 were calculated to quantify the correlation between the cross-basis functions of X1 and X2. Six quasi-Poisson DLNMs with two cross-basis functions were fitted for each data set. The lag period for X1 was set to 10 for all six models, while the lag period for X2 varied from 10 to 15. The predicted estimates of the cross-basis function of X1 from the six DLNMs were evaluated for bias and precision against the pre-specified true values of the cross-basis function of X1 through averaged mean squared errors (MSEs) and averaged standard errors. A time-series data set of weekly Zika virus infection counts and meteorological measurements in Colombia was used as a case study. The averaged RV-coefficients between the cross-basis matrices of X1 and X2 were all greater than 0.98 regardless of the correlation between X1 and X2. The largest averaged MSEs and the largest averaged standard errors of X1 estimates both occurred in the model where the lag periods for X1 and X2 were the same.
Our findings indicate that two cross-basis functions became highly correlated when they had the same length of lag. Additionally, using two highly correlated cross-basis functions in a single DLNM resulted in more biased and less precise estimates.
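
The RV-coefficient mentioned above can be illustrated with a minimal Python sketch. This is not the dissertation's code: the function names `rv_coefficient` and `lag_matrix`, the simulated series, and the use of plain lagged copies in place of a full cross-basis (which would also apply basis functions along the exposure and lag dimensions) are all assumptions made for illustration only.

```python
import numpy as np


def rv_coefficient(a, b):
    """RV-coefficient between two matrices that share the same rows.

    Uses the identity tr(AA'BB') = ||A'B||_F^2 on column-centered
    matrices, so no n-by-n product is ever formed.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    num = np.sum((a.T @ b) ** 2)
    den = np.sqrt(np.sum((a.T @ a) ** 2) * np.sum((b.T @ b) ** 2))
    return num / den


def lag_matrix(x, max_lag):
    """Hypothetical helper: columns are x at lags 0..max_lag.

    The first max_lag rows are dropped so every column is complete.
    """
    n = len(x)
    return np.column_stack(
        [x[max_lag - lag : n - lag] for lag in range(max_lag + 1)]
    )


# Two simulated exposure series (illustrative values, not the study's data).
rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = 0.5 * x1 + rng.normal(size=1000)

# Same lag length for both series, as in the first of the six models.
rv = rv_coefficient(lag_matrix(x1, 10), lag_matrix(x2, 10))
print(f"RV-coefficient: {rv:.3f}")
```

The value lies between 0 and 1, with values near 1 indicating that the two matrices carry nearly the same configuration of observations. Comparing matrices built with different lag lengths would require trimming them to a common set of rows first, which this sketch does not do.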

Subject Area

Biostatistics

Recommended Citation

Liao, Yunqi, "Influence Of Highly Correlated Cross-Basis Functions In The Distributed Lag Non-Linear Model" (2017). Texas Medical Center Dissertations (via ProQuest). AAI10606867.
https://digitalcommons.library.tmc.edu/dissertations/AAI10606867
