Estimating the mean and standard deviation of concentration distributions using censored samples

John Allen Mikus, The University of Texas School of Public Health


Environmental data sets of pollutant concentrations in air, water, and soil frequently include unquantified sample values reported only as being below the analytical method detection limit. These values, referred to as censored values, should be considered in the estimation of distribution parameters as each represents some value of pollutant concentration between zero and the detection limit. Most of the currently accepted methods for estimating the population parameters of environmental data sets containing censored values rely upon the assumption of an underlying normal (or transformed normal) distribution. This assumption can result in unacceptable levels of error in parameter estimation due to the unbounded left tail of the normal distribution. With the beta distribution, which is bounded by the same range of a distribution of concentrations, $\rm\lbrack0\le x\le1\rbrack,$ parameter estimation errors resulting from improper distribution bounds are avoided. This work developed a method that uses the beta distribution to estimate population parameters from censored environmental data sets and evaluated its performance in comparison to currently accepted methods that rely upon an underlying normal (or transformed normal) distribution. Data sets were generated assuming typical values encountered in environmental pollutant evaluation for mean, standard deviation, and number of variates. For each set of model values, data sets were generated assuming that the data was distributed either normally, lognormally, or according to a beta distribution. For varying levels of censoring, two established methods of parameter estimation, regression on normal ordered statistics, and regression on lognormal ordered statistics, were used to estimate the known mean and standard deviation of each data set. The method developed for this study, employing a beta distribution assumption, was also used to estimate parameters and the relative accuracy of all three methods were compared. For data sets of all three distribution types, and for censoring levels up to 50%, the performance of the new method equaled, if not exceeded, the performance of the two established methods. Because of its robustness in parameter estimation regardless of distribution type or censoring level, the method employing the beta distribution should be considered for full development in estimating parameters for censored environmental data sets.

Subject Area

Statistics|Environmental science

Recommended Citation

Mikus, John Allen, "Estimating the mean and standard deviation of concentration distributions using censored samples" (1997). Texas Medical Center Dissertations (via ProQuest). AAI9809549.