Date of Award

Spring 5-2020

Degree Name

Doctor of Philosophy (PhD)




It is often of interest to measure the distribution (i.e., mean and percentiles) of count outcomes from national survey data to assess population consumption and guide public health efforts for substances such as alcohol, cigarettes, marijuana, and other illicit or licit drugs. Currently available methods for estimating the distribution of dietary intakes do not immediately lend themselves to estimating the consumption of substances measured as counts, nor do they accommodate the complex design elements – strata, clusters, and weights – characteristic of national surveys. We introduce an accurate methodology, called the Survey Adjusted Count (SAC) method, and an associated SAS macro for estimating population-level distribution statistics (means, percentiles, and standard errors) for cross-sectional and longitudinal count data that arise from complex national surveys. First, a negative binomial hurdle model is used to estimate the product of the probability of consuming in a given time period (e.g., day or year) and the amount consumed in that time period, over the number of longitudinal observations available in the study (two or more, depending on the study). The two model parts are then linked, allowing for correlation between them. Using these model-based parameter estimates, the distribution of consumption is then simulated to calculate population mean consumption and percentiles. Standard errors are then estimated using the Balanced Repeated Replication method, which accounts for stratification, clustering, and weighting. We validated the methodology via a simulation study comparing its performance with that of currently available methods, and illustrated the utility of the method by estimating alcohol intake from a cross-sectional and a longitudinal survey – the National Health and Nutrition Examination Survey and the National Longitudinal Survey of Youth, respectively.
Application of the method to these data allowed us to estimate mean (cross-sectional) and lifetime mean (longitudinal) consumption of alcohol, as well as percentiles and standard errors. Furthermore, using the SAS macro we provide, these distributions can be estimated for subgroups or demographics of interest. The SAC method therefore yields accurate estimates of short-term and/or lifetime consumption of count outcomes by subgroup of interest and can facilitate accurate public health recommendations.
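
The simulation and variance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the SAS macro and does not reproduce the SAC method: it omits the link between the two model parts, the longitudinal structure, and the construction of replicate weights. The function names, the mean and dispersion parameterization of the negative binomial (variance = mu + alpha * mu^2), and the basic Balanced Repeated Replication formula without a Fay adjustment are all assumptions made for illustration.

```python
import numpy as np


def simulate_hurdle_nb(p_consume, mu, alpha, n_draws, rng):
    """Simulate counts from a negative binomial hurdle model (illustrative).

    p_consume: probability of any consumption in the period
    mu, alpha: mean and dispersion of the untruncated negative binomial,
               assumed here to have variance mu + alpha * mu**2
    """
    nb_n = 1.0 / alpha          # NumPy's "n" parameter
    nb_p = nb_n / (nb_n + mu)   # NumPy's "p" parameter; implies mean mu
    counts = np.zeros(n_draws, dtype=np.int64)
    # Hurdle part: decide which draws are consumers at all.
    pending = rng.random(n_draws) < p_consume
    # Count part: zero-truncated negative binomial via rejection sampling,
    # redrawing any consumer whose count came out as zero.
    while pending.any():
        idx = np.flatnonzero(pending)
        draws = rng.negative_binomial(nb_n, nb_p, size=idx.size)
        counts[idx] = draws
        pending[idx] = draws == 0
    return counts


def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q percent."""
    values = np.asarray(values)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum_share = np.cumsum(weights[order]) / weights.sum()
    pos = min(np.searchsorted(cum_share, q / 100.0), values.size - 1)
    return values[order][pos]


def brr_standard_error(full_estimate, replicate_estimates):
    """Basic BRR standard error: root mean squared deviation of the
    replicate estimates from the full-sample estimate."""
    reps = np.asarray(replicate_estimates, dtype=float)
    return float(np.sqrt(np.mean((reps - full_estimate) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(2020)
    # Hypothetical parameter values, not estimates from any survey.
    drinks = simulate_hurdle_nb(p_consume=0.6, mu=3.0, alpha=0.5,
                                n_draws=100_000, rng=rng)
    equal_weights = np.ones(drinks.size)
    print("mean:", np.average(drinks, weights=equal_weights))
    print("90th percentile:", weighted_percentile(drinks, equal_weights, 90))
```

In practice, each replicate estimate passed to `brr_standard_error` would come from refitting the model and repeating the simulation with one set of replicate weights, so the standard error reflects the stratification, clustering, and weighting of the survey design.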