Language
English
Publication Date
1-1-2025
Journal
PeerJ
DOI
10.7717/peerj.19504
PMID
40444286
PMCID
PMC12121622
PubMedCentral® Posted Date
5-26-2025
PubMedCentral® Full Text Version
Post-print
Abstract
Background: Colon cancer screening studies are needed for the early detection of colorectal polyps to reduce the risk of colorectal cancer. However, polyp data are typically analyzed in dichotomized form (polyp detected or not) and sometimes with standard count models, which can lead to inaccurate findings in research studies. Zero-inflated models are a more appropriate approach for evaluating colon polyps because they account for existing polyps that go undetected at colonoscopy screening.
Method: We demonstrated the application of zero-inflated and hurdle models, including zero-inflated Poisson (ZIP), zero-inflated robust Poisson (ZIRP), zero-inflated negative binomial (ZINB), zero-inflated generalized Poisson (ZIGP), zero hurdle Poisson (ZHP), and zero hurdle negative binomial (ZHNB) models, and compared them with standard approaches, including logistic regression (LR), Poisson regression (PR), robust Poisson (RP), and negative binomial (NB) regression, for the evaluation of colorectal polyps using datasets from two randomized studies and one observational study. We also provide a step-by-step approach for selecting appropriate models for analyzing polyp data.
Results: All datasets contained a substantial proportion of zero polyp counts, and therefore zero-inflated or hurdle models outperformed single-distribution models. Using the ZIP analysis, we showed that cap-assisted colonoscopy yielded significantly more colon polyps (risk ratio [RR] = 1.38; 95% confidence interval [CI] [1.05-1.81]) than standard colonoscopy. However, this finding was missed by standard analytic methods, including LR (odds ratio [OR] = 0.90; 95% CI [0.59-1.37]), PR (RR = 1.14; 95% CI [0.93-1.41]), and NB (RR = 1.16; 95% CI [0.89-1.51]). The standard approaches for analyzing polyp data, such as LR, PR, RP, or NB regression, produced potentially inaccurate findings compared with zero-inflated models in all example datasets. Furthermore, simulation studies confirmed the superiority of ZIRP over alternative models in a range of datasets differing from the case studies. ZIRP was found to be the optimal method for analyzing polyp data in randomized studies, while the ZINB/ZHNB model showed a better fit in an observational study.
Conclusion: We suggest that colonoscopy studies jointly use the polyp detection rate and polyp counts as quality measures. Based on theoretical, empirical, and simulation considerations, we encourage analysts to use zero-inflated models for evaluating colorectal polyps in colonoscopy screening studies to support proper clinical interpretation of data and accurate reporting of findings. A similar approach can also be used for analyzing other types of polyp counts in colonoscopy studies.
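To illustrate the zero-inflated Poisson (ZIP) idea named in the abstract, the following is a minimal sketch and not the authors' analysis code. It assumes an intercept-only ZIP model, in which each count is a structural zero with probability pi and otherwise a Poisson(lam) draw. The data are simulated rather than taken from the study datasets, and the model is fitted with a simple EM algorithm chosen here for illustration. All function names (`zip_pmf`, `sample_poisson`, `simulate_zip`, `fit_zip_em`) and parameter values are hypothetical.

```python
import math
import random


def zip_pmf(k, pi, lam):
    """P(Y = k) for a ZIP: structural zero with probability pi, else Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_zip(n, pi, lam, rng):
    """Simulate n counts from the ZIP model (illustrative data only)."""
    return [0 if rng.random() < pi else sample_poisson(lam, rng) for _ in range(n)]


def fit_zip_em(counts, n_iter=200):
    """Intercept-only ZIP fit by EM; returns (pi_hat, lam_hat)."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    # Crude starting values: half of the zeros are structural.
    pi, lam = 0.5 * n_zero / n, max(total / n, 1e-8)
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero.
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing proportion and the Poisson mean.
        structural = w * n_zero
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam


if __name__ == "__main__":
    rng = random.Random(2025)
    counts = simulate_zip(5000, pi=0.4, lam=2.0, rng=rng)  # assumed values
    pi_hat, lam_hat = fit_zip_em(counts)
    mean = sum(counts) / len(counts)
    print("observed share of zeros:", counts.count(0) / len(counts))
    print("zeros implied by a plain Poisson fit:", math.exp(-mean))
    print("zeros implied by the ZIP fit:", zip_pmf(0, pi_hat, lam_hat))
    print("ZIP estimates (pi, lam):", pi_hat, lam_hat)
```

Comparing the observed share of zeros with the share implied by each fitted model is one informal way to see why a single Poisson distribution can misrepresent data with excess zeros. Covariates such as a treatment arm, robust standard errors, and the negative binomial or hurdle variants would require a full regression implementation in a statistical package.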
Keywords
Humans; Colonic Polyps; Colonoscopy; Early Detection of Cancer; Biostatistics; Models, Statistical; Colorectal Neoplasms; Poisson Distribution; Colonic Neoplasms; Colonoscopy studies; Polyps; Count data; Zero-inflated models; Regression analysis
Published Open-Access
yes
Recommended Citation
Dwivedi, Alok K; Elhanafi, Sherif E; Othman, Mohamed O; et al., "Zero-Inflated Models for the Evaluation of Colorectal Polyps in Colon Cancer Screening Studies-A Value-Based Biostatistics Practice" (2025). Faculty and Staff Publications. 4067.
https://digitalcommons.library.tmc.edu/baylor_docs/4067