Dissertations & Theses (Open Access)

Graduation Date

Summer 8-15-2018

Degree Name

Doctor of Philosophy (PhD)

School Name

The University of Texas School of Biomedical Informatics at Houston

Advisory Committee

Hua Xu, PhD


Extracting clinical information from unstructured clinical documents and encoding it with standard medical terminologies is vital to enabling secondary use of clinical data from practice. SNOMED CT is the most comprehensive medical ontology, with broad types of concepts and detailed relationships, and it has been widely used in many clinical applications. However, few studies have investigated the use of SNOMED CT in clinical information extraction.

In this dissertation research, we developed a fine-grained information model based on SNOMED CT and built novel information extraction systems to recognize clinical entities, identify their relations, and encode them to SNOMED CT concepts. Our evaluation shows that such ontology-based information extraction systems using SNOMED CT can achieve state-of-the-art performance, indicating the potential of SNOMED CT in clinical natural language processing.


SNOMED CT, natural language processing, information extraction, semantic types, Ontology-Based Information Extraction (OBIE)