Dynamic approach for electrocardiogram (ECG) signal and RNA-seq data analysis
Portable, wearable, and wireless electrocardiogram (ECG) systems have the potential to serve as point-of-care diagnostic systems for cardiovascular disease. Such systems require automatic detection of cardiovascular disease. Even in primary care, automating ECG diagnosis will improve its efficiency and reduce the training required of local healthcare workers. However, few fully automatic algorithms for detecting myocardial infarction (MI) have been well developed. This paper presents a novel automatic MI classification algorithm based on second-order ordinary differential equations (ODEs) with time-varying coefficients, which simultaneously capture the morphological and dynamic features of highly correlated ECG signals. Effectively estimating the unobserved state variables and the parameters of the second-order ODEs significantly improved classification accuracy. The estimated time-varying coefficients of the second-order ODE were used as input to a support vector machine (SVM) for MI classification. The proposed method was applied to the PTB diagnostic ECG database in PhysioNet. For binary MI classification with 12-lead ECGs, the overall sensitivity, specificity, and classification accuracy were 98.7%, 96.4%, and 98.3%, respectively. We also found that even with single-lead ECG signals, accuracy can reach 97%. Multiclass MI classification is a challenging task. The ODE model developed for 12-lead ECGs, coupled with a multiclass SVM, reached 96.4% accuracy in classifying 5 subgroups of MI and the healthy controls. The second goal of this dissertation is to find the association between somatic mutations and gene expression. Understanding the transcriptome is essential for identifying the functional elements of the genome and revealing the mechanisms of disease.
The current statistical methods for eQTL analysis were originally designed for microarray gene expression profiling technologies. The recently developed next-generation mRNA sequencing (RNA-seq) assay generates millions of short reads of mRNA or cDNA that are mapped to the genome and yield a sequence of read counts at millions of genomic positions. The classical methods for eQTL analysis, which consider only a single phenotype at a time, fail to use all of the information in the transcripts. Whole-genome RNA-seq eQTL analysis therefore poses a significant challenge. To meet this challenge, we use second-order ordinary differential equations (ODEs) as a general framework for RNA-seq eQTL analysis that takes expression variation at the level of genomic position into account. The proposed method was applied to the TCGA ovarian cancer dataset. We found several genes that were significantly associated with somatic mutations. Most of the genes detected by the proposed method had also been detected by previous researchers; however, the proposed method detected more significant genes. Moreover, the current approach takes expression variation across genomic positions into account: each gene expression curve was estimated by a second-order inhomogeneous differential equation. Furthermore, almost all of the genes detected by the current method were risk factors for various cancer types. We first used a permutation test to find the association between gene expression and somatic mutations. Second, we assumed normality and compared the p-values from the permutation test with those from the theoretical test. We found that the p-values from the two tests were almost the same. Since the permutation test is computationally very intensive, we used the theoretical approach to investigate the association.
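The comparison between permutation-based and theoretical p-values described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the test statistic (a Welch-type standardized mean difference), the standard normal approximation for the theoretical p-value, the simulated data, and all function names are assumptions chosen for illustration, whereas the actual analysis tests ODE-based quantities estimated from RNA-seq read counts.

```python
import random
from statistics import NormalDist, mean, variance


def welch_statistic(a, b):
    """Difference in group means divided by its estimated standard error."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


def theoretical_p_value(a, b):
    """Two-sided p-value, treating the statistic as standard normal (assumption)."""
    z = abs(welch_statistic(a, b))
    return 2.0 * (1.0 - NormalDist().cdf(z))


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided p-value obtained by randomly reshuffling the group labels."""
    rng = random.Random(seed)
    observed = abs(welch_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(welch_statistic(pooled[:len(a)], pooled[len(a):]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Simulated (hypothetical) expression summaries for samples with and
# without a somatic mutation; these are not TCGA data.
data_rng = random.Random(42)
mutated = [data_rng.gauss(0.6, 1.0) for _ in range(60)]
wild_type = [data_rng.gauss(0.0, 1.0) for _ in range(60)]

p_theory = theoretical_p_value(mutated, wild_type)
p_perm = permutation_p_value(mutated, wild_type)
print(f"theoretical p = {p_theory:.4f}, permutation p = {p_perm:.4f}")
```

The permutation loop recomputes the statistic thousands of times for a single test, while the theoretical p-value needs one evaluation. That cost difference, multiplied across many genes, is the practical reason for preferring the theoretical test once the two are seen to agree.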
Zewdie, Getie Abeje, "Dynamic approach for electrocardiogram (ECG) signal and RNA-seq data analysis" (2014). Texas Medical Center Dissertations (via ProQuest). AAI3665405.