Covariance-regularized macrolevel discriminant analysis with application to cervical cancer screening

Xinshuo Wu, The University of Texas School of Public Health

Abstract

We focused on classification with a nested, hierarchical data structure, in which a macro-level object is classified based on measurements of the micro-level observations embedded within it. This type of problem typically arises in quantitative cytology, where patients are classified into disease statuses based on measurements of a random collection of their cells. Macrolevel discriminant analysis (MDA) was recently developed to classify macro-level objects while taking into account the correlations between micro-level observations. However, regular MDA faces significant challenges in high-dimensional settings. We proposed applying covariance regularization techniques within the statistical framework of MDA to address this limitation. Results from the simulation study revealed varying degrees of advantage of covariance-regularized MDA methods over regular MDA when the underlying covariance matrix was sparse. The magnitude of the improvement appeared to be positively associated with the number of features. In the non-sparse case, however, covariance-regularized MDA methods did not perform as well as regular MDA. We applied the proposed methods to a cervical cancer dataset of 1728 patients, with a random sample of 100 cells per patient and 112 features measured per cell. The Principal Orthogonal ComplEment Thresholding (POET) method achieved the highest overall accuracy, 85.22%, and the highest area under the ROC curve (AUC), 0.7524. The shrinkage method with the identity matrix as target gave the best clinically relevant result, with an estimated sensitivity of 42.70% when specificity was controlled at 90.00%.
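
As an illustration of the kind of covariance regularization the abstract refers to, the following is a minimal sketch of linear shrinkage of a sample covariance matrix toward an identity target. It is not the dissertation's implementation. The function name `shrink_to_identity`, the hand-picked shrinkage intensity `alpha`, the scaling of the identity by the mean sample variance, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def shrink_to_identity(X, alpha):
    """Shrink the sample covariance of X toward a scaled identity matrix.

    X     : array of shape (n_samples, n_features)
    alpha : shrinkage intensity in [0, 1], fixed by hand here (assumption);
            in practice it would be estimated from the data.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    S = np.cov(X, rowvar=False)        # sample covariance, features in columns
    p = S.shape[0]
    mu = np.trace(S) / p               # mean variance, used to scale the target
    # Convex combination of the sample covariance and the scaled identity.
    return (1.0 - alpha) * S + alpha * mu * np.eye(p)


# Synthetic example with fewer samples than features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))

S = np.cov(X, rowvar=False)
S_shrunk = shrink_to_identity(X, alpha=0.5)

rank_sample = np.linalg.matrix_rank(S)               # rank-deficient
min_eig_shrunk = np.linalg.eigvalsh(S_shrunk).min()  # strictly positive
```

With fewer samples than features the sample covariance is singular and cannot be inverted inside a discriminant rule, whereas the shrunk estimate is positive definite for any `alpha` greater than zero and keeps the same trace as the sample covariance.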

Subject Area

Biostatistics|Statistics

Recommended Citation

Wu, Xinshuo, "Covariance-regularized macrolevel discriminant analysis with application to cervical cancer screening" (2015). Texas Medical Center Dissertations (via ProQuest). AAI1602765.
http://digitalcommons.library.tmc.edu/dissertations/AAI1602765
